We propose a simple continuous-time model for the lead-lag effect between two financial
assets. A two-dimensional process $(X_t, Y_t)$ reproduces a lead-lag effect if, for some time shift
$\vartheta \in \mathbb{R}$, the process $(X_t, Y_{t+\vartheta})$ is a semi-martingale with respect to a certain filtration. The
value of the time shift $\vartheta$ is the lead-lag parameter. Depending on the underlying filtration,
the standard no-arbitrage case is obtained for $\vartheta = 0$. We study the problem of estimating the
unknown parameter $\vartheta \in \mathbb{R}$, given randomly sampled non-synchronous data from $(X_t)$ and $(Y_t)$.
By applying a certain contrast optimization based on a modified version of the Hayashi-Yoshida 
covariation estimator, we obtain a consistent estimator of the lead-lag parameter, together with 
an explicit rate of convergence governed by the sparsity of the sampling design. 

Keywords: contrast estimation; discretely observed continuous-time processes; 
Hayashi-Yoshida covariation estimator; lead-lag effect 

1. Introduction 

Market participants usually agree that certain pairs of assets $(X, Y)$ share a "lead-lag
effect," in the sense that the lagger (or follower) price process $Y$ tends to partially
reproduce the oscillations of the leader (or driver) price process $X$, with some temporal
delay, or vice versa. The
lead-lag effect may have some importance in practice, when assessing the quality of risk 
management indicators, for instance, or, more generally, when considering statistical 
arbitrage strategies. Also, note that it can be measured at various temporal scales (daily, 
hourly or even at the level of seconds, for flow products traded on electronic markets). 

The lead-lag effect is a concept of common practice that has some history in financial 
econometrics. In time series for instance, this notion can be linked to the concept of
Granger causality, and we refer to Comte and Renault [4] for a general approach. From
a phenomenological perspective, the lead-lag effect is supported by empirical evidence
reported in [3, 6] and [18], together with [20] and the references therein. To our knowledge,
however, only a few mathematical results are available from the point of view of statistical
estimation from discretely observed, continuous-time processes. The purpose of this paper
is to fill this gap, at least in part. (Recently, Robert and Rosenbaum [23] also studied the
lead-lag effect by means of random matrices, in a mixed asymptotic framework, a setting
which is rather different from that of the present paper.)

1.1. Motivation 

(1) Our primary goal is to provide a simple - yet relatively general - model for capturing
the lead-lag effect in continuous time, readily compatible with stochastic calculus in
financial modeling. Informally, if $\tau_{-\vartheta}(Y)_t := Y_{t+\vartheta}$, with $\vartheta \in \mathbb{R}$, is the time-shift operator,
we say that the pair $(X, Y)$ will produce a lead-lag effect as soon as $(X, \tau_{-\vartheta}(Y))$ is a
(regular) semi-martingale with respect to an appropriate filtration, for some $\vartheta$, called
the lead-lag parameter. The usual no-arbitrage case is embedded into this framework for
$\vartheta = 0$. More in Section 2 below.

(2) At a similar level of importance, we aim at constructing a simple and efficient procedure
for estimating the lead-lag parameter $\vartheta$ based on historical data. The underlying
statistical model is generated by a - possibly random - sampling of both $X$ and $Y$. The
sampling typically happens at irregular and non-synchronous times for $X$ and $Y$. We
construct, in the paper, an estimator of $\vartheta$ based on a modification of the Hayashi-Yoshida
covariation estimator; see [11] and [13]. Our result is that the lead-lag parameter can be
consistently estimated against a fairly general class of sampling schemes. Moreover, we
make the rate of convergence of our procedure explicit.

(3) From a financial point of view, unless appropriate time shifts are operated, our
model prevents our primary assets $X$ and $Y$ from being semi-martingales with respect
to the same filtration. This is consistent, as far as modeling is concerned, but allows, in
principle, for market imperfections such as statistical arbitrage if the lead-lag parameter
$\vartheta$ is different from zero. More in Section 3.4 below. Addressing such a possibility is indeed
the issue of the lead-lag effect, but we will content ourselves with detecting whether the
lead-lag effect is present or not. The quantification of statistical arbitrage in terms of $\vartheta$
(and other parameters such as trading frequency, market friction, volatility and so on)
lies beyond the scope of this paper.

(4) From a statistical inference point of view, the statistician and the data provider 
are not necessarily the same agents, and this leads to technical difficulties linked to the 
sampling strategy. The data provider may choose the opening/closing for X and Y, pos- 
sibly traded on different markets, possibly on different time clocks. He or she may also 
sample points at certain trading times or events which are randomly chosen in a partic- 
ular time window. This typically happens if daily data are considered. At a completely 
different level, if high-frequency data are concerned, trading times are genuinely ran- 
dom and non-synchronous. Our approach will simultaneously incorporate these different 
points of view.

1.2. Organization of the paper

In Section 2, we present our stochastic model for describing the lead-lag effect. We start 
with the simplest Bachelier model with no drift in Section 2.1. The issue boils down
to defining properly the lead-lag effect between two correlated Brownian motions. In
Section 2.2, a general lead-lag model is presented for a two-dimensional process, for which
the marginal processes are semi-martingales with locally bounded drift and continuous 
local martingale part, with properly defined diffusion coefficients. 

We present our main result in Section 3. Section 3.1 gives a precise construction of 
the underlying statistical experiment with the corresponding assumptions on the obser- 
vation sampling schemes. The estimation procedure is constructed in Section 3.2, via an 
appropriate contrast function based on the covariation between X and Y when one asset 
is artificially shifted in time, the amount of this shift being the argument of the contrast 
function. Our estimator is robust to non-synchronous data and does not require any
pre-processing, contrary to the previous-tick algorithm; see, for example, [27]. In Section 3.3,
we state our main result in Theorem 1: we show that the lead-lag parameter between $X$
and $Y$ can be consistently estimated from non-synchronous historical data over a fixed
time horizon $[0, T]$. The rate is governed by $\Delta_n$, the maximal distance between two data
points. We show that the rate of convergence of our estimator is essentially $\Delta_n^{-1}$ and not
$\Delta_n^{-1/2}$, as one would expect from a regular estimation problem in diffusion processes;
see, for example, [7]. This comes from the underlying structure of the statistical model,
which is not regular, and which shares some analogy with change-point problems. As for 
our procedure, we investigate further its asymptotic properties in Proposition 1 when we 
confine ourselves to the simpler case where X and Y are marginally Brownian motions 
that are observed at synchronous data points. In that case, we can exhibit a central limit 
theorem for our contrast function. A closer inspection of the limiting variance reveals the 
effect of the correlation between the two assets, which also plays a role in the accuracy of 
the estimation procedure. Finally, we show in Proposition 2 that a simple central limit 
theorem cannot hold for our estimator. We discuss this effect which is somewhat linked 
to the discretisation of our method. 

Theorem 1 is good news, as far as practical implementation is concerned, and is further 
addressed in the discussion in Section 3.4, appended with numerical illustrations on 
simulated data in Section 5 and on real data in Section 6. The proofs are delayed until 
Section 4 and the Appendix contains auxiliary technical results. 

2. The lead-lag model 
2.1. The Bachelier model 

A simple lead-lag Bachelier model with no drift between two Brownian motion components
can be described as follows. On a filtered space $(\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{t \geq 0}, \mathbb{P})$, we consider
a two-dimensional $\mathbb{F}$-Brownian motion $B = (B^{(1)}, B^{(2)})$ such that $\langle B^{(1)}, B^{(2)} \rangle_t = \rho t$ for
every $t \geq 0$ and for some $\rho \in [-1, 1]$. Let $T > 0$ be some terminal time, fixed throughout
the paper. For $t \in [0, T]$, set

$$X_t := x_0 + \sigma_1 B_t^{(1)}, \qquad Y_t := y_0 + \sigma_2 B_t^{(2)},$$

where $x_0, y_0 \in \mathbb{R}$ and $\sigma_1 > 0$, $\sigma_2 > 0$ are given constants. The corresponding Black-Scholes
version of this model is readily obtained by exponentiating $X$ and $Y$. We introduce a
lead-lag effect between $X$ and $Y$ by operating a time shift: let $\vartheta \in \mathbb{R}$ represent the lead
or lag time between $X$ and $Y$ (and assume for simplicity that $\vartheta > 0$). Put

$$\tau_\vartheta(Y)_t := Y_{t-\vartheta}, \qquad t \in [\vartheta, T]. \tag{1}$$
Our lead-lag model is the two-dimensional process

$$(X, \tau_\vartheta(Y)) = (X_t, \tau_\vartheta(Y)_t)_{t \in [\vartheta, T]}.$$

Since we have $B^{(2)} = \rho B^{(1)} + (1 - \rho^2)^{1/2} W$ with $W = (W_t)_{t \in [0,T]}$ a Brownian motion
independent of $B^{(1)}$, we obtain the simple and explicit representation

$$\begin{cases} X_t = x_0 + \sigma_1 B_t^{(1)}, \\ \tau_\vartheta(Y)_t = y_0 + \rho \sigma_2 B_{t-\vartheta}^{(1)} + \sigma_2 (1 - \rho^2)^{1/2} W_{t-\vartheta} \end{cases} \tag{2}$$

for $t \in [\vartheta, T]$. In this representation, the interpretation of the lead-lag parameter $\vartheta$ is
transparent. Alternatively, if we start with a process $(X, \widetilde{Y})$ having representation

$$(X, \widetilde{Y}) = (X, \tau_\vartheta(Y)) \tag{3}$$

as in (2), the lead-lag interpretation between $X$ and $\widetilde{Y}$ readily follows. Since $\vartheta > 0$, the
sample path of $X$ anticipates the path of $\widetilde{Y}$ by a time shift $\vartheta$ and to an amount -
measured in normalized standard deviation - proportional to $\rho \sigma_2 / \sigma_1$. In that case, we
say that $X$ is the leader, and $\widetilde{Y}$ is the lagger. For the case $\vartheta < 0$, we intertwine the roles
of $X$ and $\widetilde{Y}$ in the terminology.
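
The Bachelier lead-lag model above can be simulated on a regular grid. The following is a minimal sketch, not the authors' code: the function name, its arguments, and the convention of starting the driving Brownian motions at time minus the lag (so that the lagged path is defined on the whole of [0, 1]) are assumptions made for illustration.

```python
import numpy as np

def simulate_lead_lag_bachelier(n, lag_steps, rho, sigma1=1.0, sigma2=1.0,
                                x0=0.0, y0=0.0, seed=0):
    """Simulate X and the lagged Y on the grid i/n, i = 0, ..., n.

    The lag equals lag_steps / n >= 0: the increment of Y over a grid cell
    is driven by the Brownian increments that drove X lag_steps cells earlier.
    """
    rng = np.random.default_rng(seed)
    m = n + lag_steps                              # drivers live on [-lag, 1]
    dt = 1.0 / n
    dB = rng.normal(0.0, np.sqrt(dt), m)           # increments of B^(1)
    dW = rng.normal(0.0, np.sqrt(dt), m)           # increments of the independent W
    dB2 = rho * dB + np.sqrt(1.0 - rho**2) * dW    # increments of B^(2)
    # X uses the driver on [0, 1], i.e. the last n increments.
    X = x0 + sigma1 * np.concatenate(([0.0], np.cumsum(dB[lag_steps:])))
    # The lagged Y uses the driver on [-lag, 1 - lag], i.e. the first n increments.
    Y = y0 + sigma2 * np.concatenate(([0.0], np.cumsum(dB2[:n])))
    return X, Y
```

With this construction, increments of X correlate (with coefficient close to rho) with the increments of Y located lag_steps cells later, and are essentially uncorrelated with the simultaneous increments of Y.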

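Remark 1. Note that, except in the case $\vartheta = 0$, the process $(X_t, \widetilde{Y}_t)_{t \in [0,T]}$ is not an
$\mathbb{F}$-martingale. However, each component is a martingale with respect to a different filtration:
$X$ is an $\mathbb{F}$-martingale and $\widetilde{Y} = \tau_\vartheta(Y)$ is an $\mathbb{F}^\vartheta$-martingale, with $\mathbb{F}^\vartheta = (\mathcal{F}_t^\vartheta)_{t \geq \vartheta}$, and
$\mathcal{F}_t^\vartheta = \mathcal{F}_{t-\vartheta}$.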



2.2. Lead-lag between two semi-martingales 

We generalize the lead-lag model (3) to semi-martingales with local martingale components
that can be represented as Itô local martingales.

We need some notation. Let $T > 0$ be some terminal time, and let $\delta > 0$ represent
the maximum temporal lead-lag allowed for the model, fixed throughout the paper. On
a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, let $\mathbb{F} = (\mathcal{F}_t)_{t \in [-\delta, T+\delta]}$ be a filtration satisfying the usual
conditions. We denote by $\mathbb{F}_{[a,b]} = (\mathcal{F}_t)_{t \in [a,b]}$ the restriction of $\mathbb{F}$ to the time interval $[a, b]$.

Definition 1. The two-dimensional process $(X, Y)_{t \in [0, T+\delta]}$ is a regular semi-martingale
with lead-lag parameter $\vartheta \in [0, \delta)$ if the following decomposition holds:

$$X = X^c + A, \qquad Y = Y^c + B,$$

with the following properties:

• The process $(X_t^c)_{t \in [0, T+\delta]}$ is a continuous $\mathbb{F}_{[0, T+\delta]}$-local martingale, and the process
$(Y_t^c)_{t \in [0, T+\delta]}$ is a continuous $\mathbb{F}^\vartheta_{[0, T+\delta]}$-local martingale.

• The quadratic variations $(\langle X^c \rangle_t)_{t \in [0, T+\delta]}$ and $(\langle Y^c \rangle_t)_{t \in [0, T+\delta]}$ are absolutely continuous
w.r.t. the Lebesgue measure, and their Radon-Nikodym derivatives admit a locally
bounded version.

• The drifts $A$ and $B$ have finite variation over $[0, T + \delta]$.

Definition 2. The two-dimensional process $(X, Y)_{t \in [0, T+\delta]}$ is a regular semi-martingale
with lead-lag parameter $\vartheta \in (-\delta, 0]$ if the same properties as in Definition 1 hold, with $X$
and $Y$ intertwined and $\vartheta$ replaced by $-\vartheta$.

Remark 2. If $(X, Y)_{t \in [0, T+\delta]}$ is a regular semi-martingale with lead-lag parameter $\vartheta \in
[0, \delta)$, then the process $(\tau_{-\vartheta}(Y^c)_t)_{t \in [-\vartheta, T]}$ is a continuous $\mathbb{F}_{[-\vartheta, T]}$-local martingale, with
$\tau_{-\vartheta}(Y)_t = Y_{t+\vartheta}$ the (inverse of the) shift operator defined in (1).

Remark 3. If $(X, Y)_{t \in [0, T+\delta]}$ is a regular semi-martingale with lead-lag parameter
$\vartheta \in [0, \delta)$, then the process $(Y, X)_{t \in [0, T+\delta]}$ is a regular semi-martingale with lead-lag
parameter $-\vartheta$.

3. Main result 

3.1. The statistical model 

We observe a two-dimensional price process $(X, Y)$ at discrete times. The components
$X$ and $Y$ are observed over the time horizon $[0, T + \delta]$. The following assumption is in
force throughout:

Assumption A. The process $(X, Y) = (X_t, Y_t)_{t \in [0, T+\delta]}$ is a regular semi-martingale
with lead-lag parameter $\vartheta \in \Theta := (-\delta, \delta)$.

The - possibly random - observation times are given by the following subdivisions of
$[0, T + \delta]$:

$$\mathcal{T}^X := \{s_{1,n_1} < s_{2,n_1} < \cdots < s_{n_1,n_1}\} \tag{4}$$

for $X$ and

$$\mathcal{T}^Y := \{t_{1,n_2} < t_{2,n_2} < \cdots < t_{n_2,n_2}\} \tag{5}$$

for $Y$, with $n_1 = n_2$ or not. For simplicity, we assume $s_{1,n_1} = t_{1,n_2} = 0$ and $s_{n_1,n_1} =
t_{n_2,n_2} = T + \delta$. The sample points are either chosen by the statistician or dictated for
practical convenience by the data provider. They are usually neither equispaced in time
nor synchronous, and may depend on the values of $X$ and $Y$.

For some unknown $\vartheta \in \Theta := (-\delta, \delta)$, the process $(X, Y)$ is a regular semi-martingale
with lead-lag parameter $\vartheta$, and we want to estimate $\vartheta$ based on the set of historical data

$$\{X_s, s \in \mathcal{T}^X\} \cup \{Y_t, t \in \mathcal{T}^Y\}. \tag{6}$$

In order to describe precisely the property of the sampling scheme $\mathcal{T}^X \cup \mathcal{T}^Y$, we need some
notation that we borrow from Hayashi and Yoshida [11]. The subdivision $\mathcal{T}^X$ introduced
in (4) is mapped into a family of intervals

$$\mathcal{I} = \{I = (\underline{I}, \overline{I}] = (s_{i,n_1}, s_{i+1,n_1}],\ i = 1, \ldots, n_1 - 1\}. \tag{7}$$

Likewise, the subdivision $\mathcal{T}^Y$ defined in (5) is mapped into

$$\mathcal{J} = \{J = (\underline{J}, \overline{J}] = (t_{j,n_2}, t_{j+1,n_2}],\ j = 1, \ldots, n_2 - 1\}.$$

We will systematically employ the notation $I$ (resp., $J$) for an element of $\mathcal{I}$ (resp., $\mathcal{J}$).
We set

$$\Delta_n := \max\{\sup\{|I|, I \in \mathcal{I}\}, \sup\{|J|, J \in \mathcal{J}\}\},$$

where $|I|$ (resp., $|J|$) denotes the length of the interval $I$ (resp., $J$), and $n$ is a parameter
tending to infinity.

Remark 4. One may think of $n$ being the number of data points extracted from the
sampling, that is, $n = \#\mathcal{I} + \#\mathcal{J}$. However, as we will see, only the (random) quantity $\Delta_n$
will prove relevant for measuring the accuracy of estimation of the lead-lag parameter.
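
As a concrete illustration of the maximal mesh size defined above, the short sketch below computes it from two arrays of observation times. The function name and the input format (plain arrays of time stamps) are assumptions for illustration only.

```python
import numpy as np

def max_mesh(times_x, times_y):
    """Largest gap between consecutive observation times, over both assets."""
    tx = np.sort(np.asarray(times_x, dtype=float))
    ty = np.sort(np.asarray(times_y, dtype=float))
    return float(max(np.diff(tx).max(), np.diff(ty).max()))
```

For instance, with observation times (0, 0.1, 0.4, 1.0) for the first asset and (0, 0.5, 0.7, 1.0) for the second, the largest gap is the interval from 0.4 to 1.0 of the first asset.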

The assumptions on the sampling scheme are the following.

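Assumption B.

B1. There exists a deterministic sequence of positive numbers $v_n$ such that $v_n < \delta$ and
$v_n \to 0$ as $n \to \infty$. Moreover,

$$v_n^{-1} \Delta_n \to 0$$

in probability as $n \to \infty$.

B2. For all $I \in \mathcal{I}$, the random times $\underline{I}$ and $\overline{I}$ are $\mathbb{F}^{v_n}$-stopping times if $\vartheta \geq 0$ (resp.,
$\mathbb{F}^{-\vartheta + v_n}$-stopping times if $\vartheta < 0$). For all $J \in \mathcal{J}$, the random times $\underline{J}$ and $\overline{J}$ are
$\mathbb{F}^{\vartheta + v_n}$-stopping times if $\vartheta \geq 0$ (resp., $\mathbb{F}^{v_n}$-stopping times if $\vartheta < 0$).

B3. There exists a finite grid $\mathcal{G}^n \subset \Theta$ such that $0 \in \mathcal{G}^n$ and

- For some $\gamma > 0$, we have $\#\mathcal{G}^n = O(v_n^{-\gamma})$.

- For some deterministic sequence $\rho_n > 0$, we have

$$\bigcup_{\tilde\vartheta \in \mathcal{G}^n} [\tilde\vartheta - \rho_n, \tilde\vartheta + \rho_n] \supset \Theta$$

and

$$\lim_{n \to \infty} \rho_n \min\{\mathbb{E}[\#\mathcal{I}], \mathbb{E}[\#\mathcal{J}]\} = 0.$$

Remark 5. Since both $\mathbb{E}[\#\mathcal{I}]$ and $\mathbb{E}[\#\mathcal{J}]$ diverge at rate no less than $v_n^{-1}$, Assumption B3
implies that $\rho_n = o(v_n)$. With no loss of generality, we thus may (and will) assume that
$\rho_n \leq v_n$ for all $n$.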

3.2. The estimation procedure 

Preliminaries 

Assume first that the data arrive at regular and synchronous time stamps over the time
interval $[0, T] = [0, 1]$, with $\Delta_n = 1/n$ for simplicity. This means that we have $2n + 2$
observations

$$(X_0, Y_0), (X_{1/n}, Y_{1/n}), (X_{2/n}, Y_{2/n}), \ldots, (X_1, Y_1).$$

For every integer $k \in \mathbb{Z}$, we form the shifted time series

$$Y_{(k+i)/n}, \qquad i = 1, 2, \ldots$$

for every $i$ such that $(k + i)/n$ is an admissible time stamp (possibly, we end up with an
empty data set). We can then construct the empirical covariation estimator

$$C_n(k) := \sum_i (X_{i/n} - X_{(i-1)/n})(Y_{(i+k)/n} - Y_{(i+k-1)/n}),$$

where the sum in $i$ expands over all relevant data points. Over the time interval $[0, 1]$, the
number of elements used for the computation of $C_n(k)$ should be of order $n$ as $n \to \infty$.
Assume further for simplicity that the process $(X, Y)$ is a lead-lag Bachelier model in
the sense of Section 2.1, with lead-lag parameter $\vartheta = \vartheta_n = k_n^0 / n$, with $k_n^0$ an integer. On
the one hand, for $k = k_n^0$, we have the decomposition

$$C_n(k_n^0) = T_n^{(1)} + T_n^{(2)}$$

with

$$T_n^{(1)} = \rho \sigma_1 \sigma_2 \sum_i (B_{i/n}^{(1)} - B_{(i-1)/n}^{(1)})^2,$$

$$T_n^{(2)} = \sqrt{1 - \rho^2}\, \sigma_1 \sigma_2 \sum_i (B_{i/n}^{(1)} - B_{(i-1)/n}^{(1)})(W_{i/n} - W_{(i-1)/n}).$$

Computing successively the fourth-order moment of the random variables $T_n^{(1)} - \rho \sigma_1 \sigma_2$
and $T_n^{(2)}$ and applying Markov's inequality and the Borel-Cantelli lemma, elementary
computations show that $T_n^{(1)} \to \rho \sigma_1 \sigma_2$ and $T_n^{(2)} \to 0$ as $n \to \infty$ almost surely, and we
derive

$$C_n(k_n^0) \to \rho \sigma_1 \sigma_2 \quad \text{as } n \to \infty \text{ almost surely}.$$

On the other hand, for $k \neq k_n^0$, we have

$$C_n(k) = \widetilde{T}_n^{(1)} + \widetilde{T}_n^{(2)}$$

with

$$\widetilde{T}_n^{(1)} = \rho \sigma_1 \sigma_2 \sum_i (B_{i/n}^{(1)} - B_{(i-1)/n}^{(1)})(B_{(i+k-k_n^0)/n}^{(1)} - B_{(i+k-k_n^0-1)/n}^{(1)}),$$

$$\widetilde{T}_n^{(2)} = \sqrt{1 - \rho^2}\, \sigma_1 \sigma_2 \sum_i (B_{i/n}^{(1)} - B_{(i-1)/n}^{(1)})(W_{(i+k-k_n^0)/n} - W_{(i+k-k_n^0-1)/n}).$$

Thus, for fixed $n$ and $k > k_n^0$, the process

$$j \mapsto \sum_{i=1}^{j} (X_{i/n} - X_{(i-1)/n})(Y_{(i+k)/n} - Y_{(i+k-1)/n})$$

is an $(\mathcal{F}_{(j+k-k_n^0)/n})_{j \geq 1}$-martingale. Consequently, using the Burkholder-Davis-Gundy
inequality, we easily obtain that

$$\mathbb{E}[C_n(k)^6] \leq c n^{-3},$$

up to some constant $c > 0$. The same result holds for $k < k_n^0$. Since the number of
admissible shifts $k$ is of order $n$, we infer

$$\mathbb{E}\Big[\sup_{k \neq k_n^0} |C_n(k)|^6\Big] \leq c n^{-2}$$

up to a modification of $c$. Using again Markov's inequality and the Borel-Cantelli lemma,
we finally obtain that

$$\sup_{k \neq k_n^0} |C_n(k)| \to 0 \quad \text{as } n \to \infty \text{ almost surely}.$$

Therefore, provided $\rho \sigma_1 \sigma_2 \neq 0$, we can detect asymptotically the value $k_n^0$ that defines $\vartheta$
in the very special case $\vartheta = k_n^0 \Delta_n$, using $\hat{k}_n$ defined as one maximizer in $k$ of the contrast
sequence

$$k \mapsto |C_n(k)|.$$

Indeed, from the preceding computations, we have

$$\text{Almost surely, for large enough } n, \quad \hat{k}_n = k_n^0. \tag{8}$$

This is the essence of our method. For an arbitrary $\vartheta$, we can anticipate that an
approximation of $\vartheta$ taking the form $\hat{k}_n \Delta_n$ would add an extra error term of the order of the
approximation, that is, $\Delta_n$, which is a first guess for an achievable rate of convergence.

In a general context of regular semi-martingales with lead-lag effect, sampled at random
non-synchronous data points, we consider the Hayashi-Yoshida (later abbreviated by HY)
covariation estimator and modify it with an appropriate time shift on one component.
We maximize the resulting empirical covariation estimator with respect to the time shift
over an appropriate grid.
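
The synchronous toy computation of the Preliminaries (shifted empirical covariation, then maximization of its absolute value over integer shifts) can be sketched as follows. This is a hedged illustration rather than the authors' implementation: all function names are hypothetical, the simulator assumes unit volatilities, and the range of candidate shifts `max_shift` is an arbitrary choice.

```python
import numpy as np

def shifted_covariation(X, Y, k):
    """Sum of products of X-increments with the Y-increments k steps later."""
    dX, dY = np.diff(X), np.diff(Y)
    n = len(dX)
    if abs(k) >= n:                      # no admissible time stamp is left
        return 0.0
    if k >= 0:
        return float(np.dot(dX[:n - k], dY[k:]))
    return float(np.dot(dX[-k:], dY[:n + k]))

def estimate_shift(X, Y, max_shift):
    """Integer shift maximizing the absolute shifted covariation."""
    shifts = np.arange(-max_shift, max_shift + 1)
    contrast = [abs(shifted_covariation(X, Y, int(k))) for k in shifts]
    return int(shifts[int(np.argmax(contrast))])

def simulate(n, k0, rho, seed=0):
    """Bachelier pair on the grid i/n where Y lags X by k0 steps (assumed setup)."""
    rng = np.random.default_rng(seed)
    dB = rng.normal(0.0, np.sqrt(1.0 / n), n + k0)
    dW = rng.normal(0.0, np.sqrt(1.0 / n), n + k0)
    dB2 = rho * dB + np.sqrt(1.0 - rho**2) * dW
    X = np.concatenate(([0.0], np.cumsum(dB[k0:])))
    Y = np.concatenate(([0.0], np.cumsum(dB2[:n])))
    return X, Y
```

On a simulated path with a few thousand points and a moderate correlation, the maximizer typically coincides with the true integer shift, while the contrast at every other shift stays at the noise level.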

Construction of the estimator 

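We need some notation. If $H = (\underline{H}, \overline{H}]$ is an interval, for $\vartheta \in \Theta$, we define the shift interval
$H_\vartheta := H + \vartheta = (\underline{H} + \vartheta, \overline{H} + \vartheta]$. We write

$$X(H)_t := \int_0^t 1_H(s)\, dX_s$$

for a (possibly random) interval $H$ such that $s \mapsto 1_H(s)$ is an elementary predictable process.
Also, for notational simplicity, we will often use the abbreviation

$$X(H) := X(H)_{T+\delta} = \int_0^{T+\delta} 1_H(s)\, dX_s.$$

The shifted HY covariation contrast is defined as the function

$$\tilde\vartheta \mapsto \mathcal{U}^n(\tilde\vartheta) := 1_{\tilde\vartheta \geq 0} \sum_{I \in \mathcal{I}, J \in \mathcal{J}, \overline{I} \leq T} X(I) Y(J) 1_{\{I \cap J_{-\tilde\vartheta} \neq \emptyset\}} + 1_{\tilde\vartheta < 0} \sum_{I \in \mathcal{I}, J \in \mathcal{J}, \overline{J} \leq T} X(I) Y(J) 1_{\{J \cap I_{\tilde\vartheta} \neq \emptyset\}}.$$

Our estimator is obtained by maximizing the contrast $\tilde\vartheta \mapsto |\mathcal{U}^n(\tilde\vartheta)|$ over the finite grid
$\mathcal{G}^n$ constructed in Assumption B3 in Section 3.1 above. Eventually, $\hat\vartheta_n$ is defined as a
solution of

$$|\mathcal{U}^n(\hat\vartheta_n)| = \max_{\tilde\vartheta \in \mathcal{G}^n} |\mathcal{U}^n(\tilde\vartheta)|. \tag{9}$$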
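
The construction above can be sketched numerically for non-synchronous observations. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, observation intervals are treated as half-open (left-open, right-closed), the restriction to intervals ending before the horizon follows our reading of the definition (X-intervals for non-negative shifts, Y-intervals for negative ones), and the quadratic-cost pairwise overlap test is chosen for clarity rather than speed.

```python
import numpy as np

def shifted_hy_contrast(tx, x, ty, y, shift, horizon):
    """Shifted Hayashi-Yoshida contrast at a candidate lag `shift`.

    tx, ty: increasing observation times; x, y: observed values at those times.
    An X-increment over I is paired with a Y-increment over J whenever
    I overlaps J moved back by `shift`.
    """
    tx, x, ty, y = (np.asarray(v, dtype=float) for v in (tx, x, ty, y))
    dx, dy = np.diff(x), np.diff(y)
    i_lo, i_hi = tx[:-1], tx[1:]
    j_lo, j_hi = ty[:-1] - shift, ty[1:] - shift       # J shifted back by `shift`
    # Half-open intervals (a, b] and (c, d] intersect iff max(a, c) < min(b, d).
    overlap = (np.maximum(i_lo[:, None], j_lo[None, :])
               < np.minimum(i_hi[:, None], j_hi[None, :]))
    if shift >= 0:
        keep = (tx[1:] <= horizon)[:, None]            # X-intervals ending by the horizon
    else:
        keep = (ty[1:] <= horizon)[None, :]            # Y-intervals ending by the horizon
    mask = (overlap & keep).astype(float)
    return float(dx @ mask @ dy)

def estimate_lead_lag(tx, x, ty, y, grid, horizon):
    """Grid point maximizing the absolute contrast (first one in case of ties)."""
    grid = np.asarray(grid, dtype=float)
    values = [abs(shifted_hy_contrast(tx, x, ty, y, s, horizon)) for s in grid]
    return float(grid[int(np.argmax(values))])
```

Note that the overlap condition is symmetric: I meets J moved back by the shift exactly when I moved forward by the shift meets J, so a single test serves both signs and only the horizon restriction differs. For large samples, a sorted two-pointer sweep over the intervals would avoid building the full pairwise matrix.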



3.3. Convergence results 

Since $\tau_{-\vartheta}(Y^c)$ is an $\mathbb{F}$-local martingale, the quadratic variation process $\langle X^c, \tau_{-\vartheta}(Y^c) \rangle$ is
well defined. We are now ready to state our main result:

Theorem 1. Work under Assumptions A and B. The estimator $\hat\vartheta_n$ defined in (9) satisfies

$$v_n^{-1}(\hat\vartheta_n - \vartheta) \to 0$$

in probability, on the event $\{\langle X^c, \tau_{-\vartheta}(Y^c) \rangle_T \neq 0\}$, as $n \to \infty$.

Theorem 1 provides a rate of convergence for our estimator: the accuracy $\Delta_n^{-1}$ is
nearly achievable, to within arbitrary accuracy. The next logical step is the availability
of a central limit theorem. In the general case, this is not straightforward. We may,
however, be more accurate if we further restrict ourselves to synchronous data in the
Bachelier case; that is, we have data

$$(X_0, Y_0), (X_{\Delta_n}, Y_{\Delta_n}), (X_{2\Delta_n}, Y_{2\Delta_n}), \ldots \tag{10}$$

over the time interval $[0, T]$, and the process $(X, Y)$ admits representation (3). We can
then exhibit the asymptotic behavior of the contrast function $\tilde\vartheta \mapsto \mathcal{U}^n(\tilde\vartheta)$, in a vicinity
of size $\Delta_n$ of the lead-lag parameter. More precisely, we have the following proposition.

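Proposition 1. Let $\varphi(t) = (1 - |t|) 1_{|t| \leq 1}$ denote the usual hat function. Let us consider
the Bachelier model (3) and a synchronous observation sampling scheme (10), with
lead-lag parameter $\vartheta \in \Theta$. If $|\tilde\vartheta - \vartheta| \leq \Delta_n$, we have

$$\mathcal{U}^n(\tilde\vartheta) = \sigma_1 \sigma_2 \Big( T \rho\, \varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta)) + T^{1/2} \Delta_n^{1/2} \sqrt{1 + \rho^2 \varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta))^2}\; \xi^n \Big),$$

where $\xi^n$ is a sequence of random variables that converge in distribution to the standard
Gaussian law $\mathcal{N}(0, 1)$ as $n \to \infty$.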

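This representation is useful to understand the behavior of the contrast function $\mathcal{U}^n(\tilde\vartheta)$:
up to a scaling factor, $|\mathcal{U}^n(\tilde\vartheta)|$ is asymptotically proportional to the realization of the
absolute value of a Gaussian random variable $|\mathcal{N}(m_n(\tilde\vartheta), a_n(\tilde\vartheta)^2)|$, with

$$m_n(\tilde\vartheta) = T \rho\, \varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta)) \quad \text{and} \quad a_n(\tilde\vartheta) = T^{1/2} \Delta_n^{1/2} \sqrt{1 + \rho^2 \varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta))^2},$$

which has asymptotic value $m_n(\tilde\vartheta)$ as soon as the mean dominates the standard deviation.
We then have

$$\frac{m_n(\tilde\vartheta)}{a_n(\tilde\vartheta)} = T^{1/2} \Delta_n^{-1/2} \frac{\rho\, \varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta))}{\sqrt{1 + \rho^2 \varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta))^2}} \to \infty,$$

and this is the case if $|\tilde\vartheta - \vartheta| < \Delta_n$; otherwise, the peak $\rho\, \varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta))$ degenerates
toward 0, and the contrast behaves like a non-informative $\Delta_n^{1/2} |\mathcal{N}(0, 1)|$ up to a
multiplicative constant. It is noteworthy that Proposition 1 reveals the influence of the
correlation $\rho$ in the estimation procedure. We see that if $\rho$ is too small, namely of order
$\Delta_n^{1/2}$, the same kind of degeneracy phenomenon occurs: we do not have the divergence
$m_n(\tilde\vartheta)/a_n(\tilde\vartheta) \to \infty$ anymore, and both mean and standard deviation are of the same
order; in that latter case, maximizing $|\mathcal{U}^n(\tilde\vartheta)|$ does not locate the true value $\vartheta$.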
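
The competition between mean and standard deviation discussed above can be made concrete with a few lines of arithmetic. The sketch below assumes the mean has the form T times rho times the hat function, and the standard deviation the form sqrt(T times mesh) times sqrt(1 + (rho times hat)^2); this is our reading of Proposition 1 (in particular, the square on the hat function inside the root is an assumption), and the function names are hypothetical.

```python
import numpy as np

def hat(t):
    """Hat function: 1 - |t| on [-1, 1], zero elsewhere."""
    return np.maximum(1.0 - np.abs(t), 0.0)

def signal_to_noise(candidate, theta, delta, rho, horizon=1.0):
    """Ratio mean / standard deviation of the contrast at a candidate lag."""
    phi = hat((candidate - theta) / delta)
    mean = horizon * rho * phi
    std = np.sqrt(horizon * delta) * np.sqrt(1.0 + (rho * phi) ** 2)
    return float(mean / std)
```

For example, with horizon 1, mesh 0.01 and correlation 0.5, the ratio at the true lag is about 4.47, it vanishes for candidates at distance at least one mesh from the truth, and it drops below 1 at the true lag when the correlation is as small as the square root of the mesh (here 0.1).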

The situation is a bit more involved when looking further for the next logical step, that
is, a limit theorem for $\hat\vartheta_n \in \operatorname{argmax}_{\tilde\vartheta \in \mathcal{G}^n} |\mathcal{U}^n(\tilde\vartheta)|$. The function $\tilde\vartheta \mapsto \mathcal{U}^n(\tilde\vartheta)$ is not smooth,
even asymptotically: up to normalizing by $\Delta_n^{-1}$, $\tilde\vartheta \mapsto \varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta))$ weakly converges
to a Dirac mass at point $\vartheta$; see Proposition 1. In that case, it becomes impossible, in
general, to derive a simple central limit theorem for $\hat\vartheta_n$. Consider again the synchronous
case over $[0, T] = [0, 1]$, and pick a regular grid $\mathcal{G}^n$ with mesh $h_n$ such that $h_n \Delta_n^{-1}$ goes
to zero. In this situation, the contrast function is constant over all the points belonging
to one given interval of the form $(i\Delta_n, (i + 1)\Delta_n)$, for $i \in \mathbb{Z}$. For definiteness and without
loss of generality, we set

$$\hat\vartheta_n = \min\{\tilde\vartheta_n,\ \tilde\vartheta_n \in \operatorname{argmax}_{\tilde\vartheta \in \mathcal{G}^n} |\mathcal{U}^n(\tilde\vartheta)|\}.$$

From Theorem 1, we know that $v_n^{-1}(\hat\vartheta_n - \vartheta)$ goes to zero for any sequence $v_n$ such that
$v_n^{-1} \Delta_n \to 0$; therefore, we look for the behavior of the normalized error, with rate $\Delta_n^{-1}$.
However, the following negative result shows that no limit in distribution exists at that rate.

Proposition 2. Under the preceding assumptions, there is no random variable $Z$ such
that $\Delta_n^{-1}(\hat\vartheta_n - \vartheta)$ converges in distribution to $Z$.

The proof is given in the Appendix. Proposition 2 stems from the fact that part of
the error of $\hat\vartheta_n$ is given by the difference between $\vartheta$ and its approximation on the grid
$\mathcal{G}^n$. This error is deterministic and cannot be controlled at the accuracy level $\Delta_n$; see
the proof in the Appendix. This phenomenon is somehow illustrated in the simulation in
Section 5. Note that this negative result is not in contradiction to result (8), which states
that almost surely, for large enough $n$, $\hat{k}_n = k_n^0$. Indeed, result (8) is obtained considering
a grid with mesh $\Delta_n$ and a very special sequence of models where $\vartheta$ is of the form
$\vartheta = \vartheta_n = k_n^0 \Delta_n$, with $k_n^0$ an integer. In the case where $\vartheta$ does not depend on $n$, one can,
of course, extend the almost sure result (8). However, what can be obtained is essentially
that almost surely, for large enough $n$, $\vartheta \in (\hat\vartheta_n - \Delta_n, \hat\vartheta_n + \Delta_n)$. Therefore, we almost
surely identify the interval of size $2\Delta_n$ in which $\vartheta$ lies, but our method does not enable
us to say something more accurate.

3.4. Discussion 

Covariation estimation of non-synchronous data

The estimation of the covariation between two semi-martingales from discrete data at
non-synchronous observation times has some history. It was first introduced by Hayashi
and Yoshida [11] and subsequently studied in various related contexts by several authors.
A comprehensive list of references includes: Malliavin and Mancino [19], Hayashi and
Yoshida [10-14], Hayashi and Kusuoka [9], Ubukata and Oya [26], Hoshikawa et al. [15] 
and Dalalyan and Yoshida [5]. 

About the rate of convergence 

The condition $\Delta_n = o(v_n)$ of Assumption B1 is needed for technical reasons, in order to
manage the fact that $\Delta_n$ is random in general. In the case of regular sampling $\Delta_n = n^{-1}$
with $T = 1$, the nearly obtained rate $\Delta_n = n^{-1}$ is substantially better than the usual
$n^{-1/2}$-rate of a regular parametric statistical model. This is due to the fact that the
estimation of the lead-lag parameter is rather a change-point detection problem; see [16]
for a general reference for the structure of parametric models. A more detailed analysis
of the contrast function shows that its limit is not regular (not differentiable in the
$\vartheta$-variable), and this explains the presence of the rate $\Delta_n$. However, the optimality of
our procedure is not granted, and the rate $\Delta_n$ could presumably be improved in certain
special situations.

Lead-lag effect and arbitrage 

As stated, the lead-lag model for the two-dimensional process (X, Y) is not a semi- 
martingale, unless one component is appropriately shifted in time. This is not compatible 
in principle with the dominant theory of no-arbitrage models. This kind of modeling, 
however, seems to have some relevance in practice, and there is a natural way to reconcile 
both points of view. 

We focus, for example, on the simplest Bachelier model of Section 2.1. We show in this
paper that the lead-lag parameter $\vartheta$ can almost be identified in principle. Consequently,
the knowledge of $\vartheta$ can then be incorporated into a trading strategy. If $\vartheta \neq 0$, we can
obtain, in principle, some statistical arbitrage, in the sense that we can find, in the
Bachelier model without drift, a self-financing portfolio of assets $X$ and $\tau_{-\vartheta}(Y)$ with
initial value zero and whose expectation at time $T$ is positive.

This statistical arbitrage can be erased by introducing further trading constraints such 
as a maximal trading frequency and transaction cost (slippage, execution risk and so 
on). In this setting, we can no longer guarantee a statistical arbitrage. Moreover, we may 
certainly incorporate risk constraints in order to define an admissible strategy. 

This outlines that although we perturb the semi-martingale classical approach, our 
lead-lag model is compatible in principle with non-statistical arbitrage constraints, under 
refined studies of risk profiles. We intend to set out, in detail, these possibilities in a 
forthcoming work. 

Microstructure noise

Our model does not incorporate microstructure noise. This is reasonable if $\Delta_n$ is thought
of on a daily basis, say (if $T$ is of the order of a year or more), but is inconsistent
in a high-frequency setting where $T$ is of the order of one day. In that context, efficient
semi-martingale prices of the assets are subject to the so-called microstructure noise;
see, among others, Zhang et al. [28], Bandi and Russell [1], Barndorff-Nielsen et al. [2],
Hansen and Lunde [8], Jacod et al. [17], Rosenbaum [24, 25]. In [21] and [22], Robert and
Rosenbaum introduce a model (model with uncertainty zones) where the efficient
semi-martingale prices of the assets can be estimated at some random times from the observed
prices. In particular, it is proved that the usual Hayashi-Yoshida estimator is consistent
in this microstructure noise context as soon as it is computed using the estimated values 
of the efficient prices. Using the same approach, that is, applying the lead-lag estimator 
to the estimated values of the efficient prices, one can presumably build an estimator 
which is robust to microstructure noise. 

How to use high-frequency data in practice 

Nevertheless, when high-frequency data are considered, we propose a simple pragmatic 
methodology that allows us to implement our lead-lag estimation procedure without
requiring the relatively involved data pre-processing suggested in the previous paragraph.
A preliminary inspection of the signature plot in trading time - the realized volatility 
computed with different subsampling values for the trading times - enables us to select 
a coarse subgrid among the trading times where microstructure noise effects can be 
neglected. Thanks to the non-synchronous character of high-frequency data, we can take 
advantage of this subsampling in trading time and obtain accurate estimation of the 
lead-lag parameter, at a scale that is significantly smaller than the average mesh size of 
the coarse grid itself. This would not be possible with a regular subsampling in calendar
time, where the price at time $t$ would be defined as the last traded price before $t$. This
empirical approach is developed in the numerical illustration Section 6 on real data, in 
the particular case of measuring lead-lag between the future contract on Dax (FDAX) 
and the Euro-Bund future contract (FGBL) with same maturities. 

Extension of the model 

We consider this work as a first - and relatively simple - attempt for modeling the 
lead-lag effect in continuous time models. As a natural extension, it would presumably 
be more reasonable to consider more intricate correlations between assets in the model. 
For example, one could add a common factor in the two assets, without lead-lag effect, 
as suggested by the empirical study of Section 6. Through this, and in addition to the 
"lead-lagged correlation," one would also obtain an instantaneous correlation between the 
assets. In order to estimate the lead-lag parameter in this context, one would presumably 
be required to consider local maxima of the contrast function we develop here. Such a 
development is again left out for future work. 

4. Proof of Theorem 1 

The proof of Theorem 1 is split into four parts. In the first three parts, we work under
supplementary assumptions on the processes and the parameter space (Assumption $\widetilde{\mathrm{A}}$).
We first show that if we compute the contrast function over points $\tilde\vartheta_n$ of the grid $\mathcal{G}^n$ such
that the order of magnitude of $|\tilde\vartheta_n - \vartheta|$ is bigger than $v_n$, then the contrast function goes
to zero (Proposition 3). Then we prove that, on the contrary, if the order of magnitude of
$|\tilde\vartheta_n - \vartheta|$ is essentially smaller than $v_n$, then the contrast function goes to the covariation
between $X$ and $\tau_{-\vartheta}(Y)$ (Proposition 4). We put these two results together in the third
part, which ends the proof of Theorem 1 under the supplementary assumptions. The proof
under the initial assumptions is given in the last part.

4.1. Preliminaries 

Supplementary assumptions 

For technical convenience, we will first prove Theorem 1 when the sign of $\vartheta$ is known
and when the components X and Y are local martingales. Moreover, we introduce a 
localization tool. The quadratic variation processes of X and Y admitting locally bounded 






M. Hoffmann, M. Rosenbaum and N. Yoshida 



derivatives, there exists a sequence of stopping times tending almost surely to $T + \delta$ such
that the associated stopped processes are bounded by deterministic constants. Since
Theorem 1 is a convergence in probability result, we can, without loss of generality, work
under the supplementary assumption that the quadratic variation processes are bounded
over $[0, T + \delta]$. Therefore, we add the following restrictions:

Assumption Ã. We have Assumption A and:

Ã1. There exists $L > 0$ such that $\langle X\rangle'_{T+\delta} < L$ and $\langle Y\rangle'_{T+\delta} < L$.

Ã2. The parameter set is restricted to $\Theta = [0, \delta)$. Consequently, by $\mathcal{G}^n$ we mean here
$\mathcal{G}^n \cap [0, \delta)$.

Ã3. $X = X^c$ and $Y = Y^c$.

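Notation. We now introduce further notation. For $I \in \mathcal{I}$ and $J \in \mathcal{J}$, let

$$\underline{I}^n = \underline{I} \wedge \inf\{t, \max_{I' \in \mathcal{I}}\{\overline{I'} \wedge t - \underline{I'} \wedge t\} > v_n\} \wedge T$$

and

$$\underline{J}^n = \underline{J} \wedge \inf\{t, \max_{J' \in \mathcal{J}}\{\overline{J'} \wedge t - \underline{J'} \wedge t\} > v_n\} \wedge (T + \delta).$$

We define $\overline{I}^n$ and $\overline{J}^n$ in the same way for $\overline{I}$ and $\overline{J}$, respectively. Let $I^n = (\underline{I}^n, \overline{I}^n]$ and
$J^n = (\underline{J}^n, \overline{J}^n]$.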

Remark 6. We have the following interpretation of $\underline{I}^n$ and $\overline{I}^n$: let $\tau^n$ denote the first
time for which we know that an interval $I$ will have a width that is larger than $v_n$. Then
we keep only the $\underline{I}$ and $\overline{I}$ that are smaller than $\tau^n$. If $\tau^n < T$, we also consider $\tau^n$ among
the observation times. Note that $\tau^n$ is not a true observation time in general. However,
this will not be a problem since the set where $\Delta_n$ is bigger than $v_n$ will be asymptotically
negligible. Obviously $\underline{I}^n$ and $\overline{I}^n$ are $\mathbb{F}$-stopping times, and $\underline{J}^n$ and $\overline{J}^n$ are
$\mathbb{F}^{\vartheta+v_n}$-stopping times.

Finally, for two intervals $H = (\underline{H}, \overline{H}]$ and $H' = (\underline{H'}, \overline{H'}]$, we define

$$K(H, H') := 1_{H \cap H' \neq \emptyset}.$$
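As an illustration, the overlap indicator $K$ and a contrast of the form $\sum X(I)Y(J)K(I, J_{-\tilde\vartheta})$ can be sketched numerically. This is only a minimal sketch, assuming NumPy and fully observed sampling times; the function names `overlap` and `contrast` are hypothetical, and the sketch ignores the truncated intervals $I^n$, $J^n$ and the restriction $\overline{I} \le T$ used in the proof.

```python
import numpy as np

def overlap(h_lo, h_hi, g_lo, g_hi):
    """K(H, H') for left-open, right-closed intervals (h_lo, h_hi] and (g_lo, g_hi]."""
    return max(h_lo, g_lo) < min(h_hi, g_hi)

def contrast(tx, x, ty, y, shift):
    """Sum of X(I) * Y(J) over all pairs (I, J) such that I meets J - shift.

    tx, ty: increasing observation times of X and Y; x, y: observed values.
    """
    tx, ty = np.asarray(tx), np.asarray(ty)
    dx, dy = np.diff(x), np.diff(y)                 # increments X(I), Y(J)
    i_lo, i_hi = tx[:-1, None], tx[1:, None]        # intervals I as a column
    j_lo, j_hi = ty[None, :-1] - shift, ty[None, 1:] - shift  # shifted J as a row
    k = np.maximum(i_lo, j_lo) < np.minimum(i_hi, j_hi)       # matrix of K(I, J - shift)
    return float(dx @ k.astype(float) @ dy)

# Tiny hand-checkable example on integer times.
tx, x = [0, 1, 2], [0.0, 1.0, 3.0]
ty, y = [0, 1, 2], [0.0, 2.0, 5.0]
print(contrast(tx, x, ty, y, 0), contrast(tx, x, ty, y, 1))
```

With a zero shift only the aligned pairs overlap; with a shift of one unit only the first increment of $X$ meets the second increment of $Y$.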

4.2. The contrast function 

We consider here the case where the order of magnitude of $|\tilde\vartheta_n - \vartheta|$ is bigger than $v_n$.
We first need to give a preliminary lemma that will ensure that the quantities we will
use in the following are well defined.

Lemma 1. Work under Assumption B2, under the slightly more general assumption that
for all $I = (\underline{I}, \overline{I}] \in \mathcal{I}$, the random variables $\underline{I}$ and $\overline{I}$ are $\mathbb{F}$-stopping times. Suppose that
$\tilde\vartheta \ge \vartheta + \varepsilon_n$ and $2v_n \le \varepsilon_n$. Then for any random variable $X'$ measurable w.r.t. $\mathcal{F}_{\overline{I}^n}$, the
random variable $X' K(I^n_{\tilde\vartheta}, J^n)$ is $\mathcal{F}^{\vartheta}_{\underline{J}^n}$-measurable. In particular, $f(\overline{I}^n) X(I^n) K(I^n_{\tilde\vartheta}, J^n)$ is
$\mathcal{F}^{\vartheta}_{\underline{J}^n}$-measurable for any measurable function $f$.

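The proof of Lemma 1 is given in the Appendix. It is important to note that Lemma 1
implies that for $\tilde\vartheta \ge \vartheta + \varepsilon_n$ and $2v_n \le \varepsilon_n$, the random variable

$$1_{\{\overline{I}^n \le T\}} X(I^n) K(I^n_{\tilde\vartheta}, J^n) 1_{J^n}(s)$$

is $\mathcal{F}^{\vartheta}_s$-measurable. Indeed, $1_{J^n}(s)$ is, and $1_{J^n}(s) = 1$ implies $s > \underline{J}^n$. We now introduce
a functional version of $\mathcal{U}^n$ by considering the random process

$$\mathcal{U}^n(\tilde\vartheta)_t := \sum_{I \in \mathcal{I}, J \in \mathcal{J}, \overline{I}^n \le T} X(I^n) Y(J^n)_t K(I^n_{\tilde\vartheta}, J^n).$$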

We are now able to give the main proposition for the vanishing of the contrast function. 

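Proposition 3. Let $\varepsilon_n = 2v_n$, $\mathcal{G}^n_+ = \{\tilde\vartheta \in \mathcal{G}^n, \tilde\vartheta \ge \vartheta + \varepsilon_n\}$ and $\mathcal{G}^n_- = \{\tilde\vartheta \in \mathcal{G}^n, \tilde\vartheta \le \vartheta - \varepsilon_n\}$.
We have

$$\max_{\tilde\vartheta \in \mathcal{G}^n_+ \cup \mathcal{G}^n_-} |\mathcal{U}^n(\tilde\vartheta)_{T+\delta}| \to 0,$$

in probability.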

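Proof. Assume first $\tilde\vartheta \ge \vartheta + \varepsilon_n$. Thanks to Lemma 1, we obtain a martingale representation
of the process $\mathcal{U}^n(\tilde\vartheta)$ that takes the form

$$\mathcal{U}^n(\tilde\vartheta)_t = \sum_{I \in \mathcal{I}, J \in \mathcal{J}} \int_0^t 1_{\{\overline{I}^n \le T\}} X(I^n) K(I^n_{\tilde\vartheta}, J^n) 1_{J^n}(s)\, dY_s,$$

where the stochastic integral with respect to $Y$ is taken for the filtration $\mathbb{F}^{\vartheta}$. As a result,
the $\mathbb{F}^{\vartheta}$-quadratic variation of $\mathcal{U}^n(\tilde\vartheta)$ is given by

$$\langle \mathcal{U}^n(\tilde\vartheta)\rangle_t = \int_0^t \Big(\sum_{I \in \mathcal{I}, J \in \mathcal{J}} 1_{\{\overline{I}^n \le T\}} X(I^n) K(I^n_{\tilde\vartheta}, J^n) 1_{J^n}(s)\Big)^2 d\langle Y\rangle_s.$$

Using that the intervals $J^n$ are disjoint, we obtain

$$\langle \mathcal{U}^n(\tilde\vartheta)\rangle_t = \int_0^t \sum_{J \in \mathcal{J}} \Big(\sum_{I \in \mathcal{I}} 1_{\{\overline{I}^n \le T\}} X(I^n) K(I^n_{\tilde\vartheta}, J^n)\Big)^2 1_{J^n}(s)\, d\langle Y\rangle_s.$$

For a given interval $J^n$, the union of the intervals $I^n_{\tilde\vartheta}$ that have a non-empty intersection
with $J^n$ is an interval of width smaller than $3v_n$. Indeed, the maximum width of $J^n$ is
$v_n$, and one adds to this (if it exists) the width of the interval $I^n_{\tilde\vartheta}$ such that $\underline{I}^n_{\tilde\vartheta} \le \underline{J}^n$, $\overline{I}^n_{\tilde\vartheta} \ge \underline{J}^n$
and the width of the interval $I^n_{\tilde\vartheta}$ such that $\underline{I}^n_{\tilde\vartheta} \le \overline{J}^n$, $\overline{I}^n_{\tilde\vartheta} \ge \overline{J}^n$. Thus,

$$\Big|\sum_{I \in \mathcal{I}} 1_{\{\overline{I}^n \le T\}} X(I^n) K(I^n_{\tilde\vartheta}, J^n)\Big| \le \sup_{s \le T} \sup_{0 \le u \le 3v_n} |X_{(s+u) \wedge T} - X_s| \le 2 \max_{1 \le k \le \lceil (3v_n)^{-1} T\rceil} \sup_{t \in [3v_n(k-1), 3v_n k]} |X_{t \wedge T} - X_{3v_n(k-1)}|.$$

Consequently, we obtain for every $t \in [0, T + \delta]$ and $\tilde\vartheta \in [\vartheta + \varepsilon_n, \delta]$,

$$\langle \mathcal{U}^n(\tilde\vartheta)\rangle_t \le 4L(T + \delta) \max_{1 \le k \le \lceil (3v_n)^{-1} T\rceil} \sup_{t \in [3v_n(k-1), 3v_n k]} |X_{t \wedge T} - X_{3v_n(k-1)}|^2.$$

For every $p \ge 1$, it follows from the Burkholder-Davis-Gundy inequality that

$$E[|\mathcal{U}^n(\tilde\vartheta)_{T+\delta}|^{2p}] \lesssim \sum_{k=1}^{\lceil (3v_n)^{-1} T\rceil} E\Big[\sup_{t \in [3v_n(k-1), 3v_n k]} |X_{t \wedge T} - X_{3v_n(k-1)}|^{2p}\Big] \lesssim v_n^{p-1},$$

where the symbol $\lesssim$ means inequality in order, up to a constant that does not depend
on $n$. Pick $\varepsilon > 0$. We derive

$$P\Big[\max_{\tilde\vartheta \in \mathcal{G}^n_+} |\mathcal{U}^n(\tilde\vartheta)_{T+\delta}| \ge \varepsilon\Big] \le \varepsilon^{-2p} \sum_{\tilde\vartheta \in \mathcal{G}^n_+} E[|\mathcal{U}^n(\tilde\vartheta)_{T+\delta}|^{2p}] \to 0$$

as $n \to \infty$, provided $p > \gamma + 1$ where $\gamma$ is defined in Assumption B3, a choice that is
obviously possible. The same argument holds for the case $\tilde\vartheta \le \vartheta - \varepsilon_n$, but with an
$X$-integral representation in that latter case. The result follows. □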



4.3. Stability of the HY estimator 

We consider now the case where the order of magnitude of $|\tilde\vartheta_n - \vartheta|$ is essentially smaller
than $v_n$. We have the following proposition:

Proposition 4. Work under Assumptions Ã and B. For any sequence $\tilde\vartheta_n$ in $[0, \delta)$ such
that $\tilde\vartheta_n \le \vartheta$ and $|\tilde\vartheta_n - \vartheta| \le \rho_n$ (remember that $\rho_n$ is defined in Assumption B3), we have

$$\mathcal{U}^n(\tilde\vartheta_n) \to \langle X, \tau_{-\vartheta}(Y)\rangle_T, \qquad (11)$$

in probability as $n \to \infty$.

Proof. The proof goes into several steps. 

Step 1. In this step, we show that our contrast function can be regarded as the Hayashi-
Yoshida estimator applied to $X$ and to the properly shifted values of $Y$ plus a remainder
term. If $\vartheta = 0$, then $\tilde\vartheta_n = 0$, and Proposition 4 asserts nothing but the consistency of
the standard HY-estimator; see Hayashi and Yoshida [11] and Hayashi and Kusuoka [9].
Thus we may assume $\vartheta > 0$.
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By symmetry, we only need to consider the case where 

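Set $\delta_n = \tilde\vartheta_n - \vartheta$, $\tilde{Y}_t = \tau_{-\vartheta}(Y)_t$ and $\tilde{J}^n = J^n_{-\vartheta}$, and

$$\mathcal{U}^n(\tilde\vartheta_n) = \sum_{I \in \mathcal{I}, J \in \mathcal{J}, \overline{I} \le T} X(I^n) Y(J^n) 1_{\{I^n \cap J^n_{-\tilde\vartheta_n} \neq \emptyset\}}.$$

We then have

$$\mathcal{U}^n(\tilde\vartheta_n) = \sum_{I \in \mathcal{I}, J \in \mathcal{J}, \overline{I} \le T} X(I^n) \tilde{Y}(\tilde{J}^n) 1_{\{I^n \cap \tilde{J}^n_{-\delta_n} \neq \emptyset\}}.$$

This can be written $V^n + \mathcal{R}^n$ with

$$V^n = \sum_{I \in \mathcal{I}, J \in \mathcal{J}, \overline{I} \le T} X(I^n) \tilde{Y}(\tilde{J}^n_{-\delta_n}) 1_{\{I^n \cap \tilde{J}^n_{-\delta_n} \neq \emptyset\}}, \qquad \mathcal{R}^n = \sum_{I \in \mathcal{I}, J \in \mathcal{J}, \overline{I} \le T} X(I^n) [\tilde{Y}(\tilde{J}^n) - \tilde{Y}(\tilde{J}^n_{-\delta_n})] 1_{\{I^n \cap \tilde{J}^n_{-\delta_n} \neq \emptyset\}}.$$

Remark that $\tilde{Y}(\tilde{J}^n)$ and $\tilde{Y}(\tilde{J}^n_{-\delta_n})$ are well defined since $\tilde{Y}$ is defined on $[-\vartheta, T]$ and
$\delta_n \le 0$. For every $J \in \mathcal{J}$, $\overline{J}^n$ is an $\mathbb{F}^{\vartheta+v_n}$-stopping time; therefore the right end-point of $\tilde{J}^n_{-\delta_n}$, which equals $\overline{J}^n - \tilde\vartheta_n$, is an $\mathbb{F}^{v_n - \delta_n}$-stopping
time, and an $\mathbb{F}$-stopping time as well. Thus $V^n$ is a variant of the HY-estimator:
more precisely,

$$\mathbb{V}^n := \sum_{I, J: \overline{I} \le T} X(I^n) \tilde{Y}(\tilde{J}^n_{-\delta_n} \cap \mathbb{R}_+) 1_{\{I^n \cap (\tilde{J}^n_{-\delta_n} \cap \mathbb{R}_+) \neq \emptyset\}}$$

is the original HY-estimator, and we have $V^n - \mathbb{V}^n \to 0$ in probability as $n \to \infty$. It
follows that $V^n \to \langle X, \tilde{Y}\rangle_T$ in probability as $n \to \infty$; see [11] and [9].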

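Step 2. Before turning to the term $\mathcal{R}^n$, we give a technical lemma and explain a
simplifying procedure. For an interval $I = (\underline{I}, \overline{I}] \in \mathcal{I}$, set

$$M^I := \sup\{\overline{\tilde{J}^n_{-\delta_n}},\ J \in \mathcal{J},\ \overline{\tilde{J}^n_{-\delta_n}} \le \overline{I}^n\}.$$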

Note that if we consider the interval J at the extreme left end of the family J", we have, 
for large enough n, 

say, so we may assume that the set over which we take the supremum is non-empty. 

Lemma 2. Work under Assumption B2. The random variables $M^I$ are $\mathbb{F}$-stopping
times.

The proof of this lemma is given in the Appendix. We now use a simplifying operation.
For each $I^n$, we merge all the $\tilde{J}^n_{-\delta_n}$ such that $\tilde{J}^n_{-\delta_n} \subset I^n$. We call this procedure Π-reduction.
The Π-reduction produces a new sequence of increasing random intervals extracted from
the original sequence $(\tilde{J}^n_{-\delta_n})$, which are $\mathbb{F}$-predictable by Lemma 2. More precisely, the
end-points are $\mathbb{F}$-stopping times. It is important to remark that the Π-reduction implies
that there are at most two points of type $J$ between any $\underline{I}^n$ and $\overline{I}^n$. Moreover, since
$\mathcal{R}^n$ is a bilinear form of the increments of $X$ and $\tilde{Y}$, it is invariant under Π-reduction.
Likewise for the maximum length $\Delta_n$. Thus, without loss of generality, we may assume
that the $\tilde{J}^n_{-\delta_n}$ are Π-reduced.

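Step 3. We now turn to $\mathcal{R}^n$. We write

$$I^n(\tilde{J}^n_{-\delta_n}) = \bigcup_{I \in \mathcal{I},\ \overline{I} \le T,\ I^n \cap \tilde{J}^n_{-\delta_n} \neq \emptyset} I^n.$$

We have

$$|\mathcal{R}^n| \le \sum_{J \in \mathcal{J}} |\tilde{Y}(\tilde{J}^n) - \tilde{Y}(\tilde{J}^n_{-\delta_n})|\, |X(I^n(\tilde{J}^n_{-\delta_n}))|.$$

We now index the intervals $J^n$ by $j$ and set $J^n_j = \emptyset$ if $j > \#\mathcal{J}$. Thus, the preceding
line can be written

$$|\mathcal{R}^n| \le \sum_j |\tilde{Y}(\tilde{J}^n_j) - \tilde{Y}(\tilde{J}^n_{-\delta_n, j})|\, |X(I^n(\tilde{J}^n_{-\delta_n, j}))|.$$

Then the Cauchy-Schwarz inequality gives that $(E[|\mathcal{R}^n|])^2$ is smaller than

$$\sum_j E[|\tilde{Y}(\tilde{J}^n_j) - \tilde{Y}(\tilde{J}^n_{-\delta_n, j})|^2] \times \sum_j E[|X(I^n(\tilde{J}^n_{-\delta_n, j}))|^2].$$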

We easily get that 

Y,n\nj?)-Y(j^ n ,j)\ 2 }<w, 

3 

and we claim that (see next step)

$$\sum_j E[|X(I^n(\tilde{J}^n_{-\delta_n, j}))|^2] \lesssim 1. \qquad (12)$$

Since $|\delta_n| \le \rho_n$, Proposition 4 readily follows.

Step 4. It remains to prove (12). Here we extend $(X_t)_{t \in \mathbb{R}_+}$ as $X_s = 0$ for $s < 0$, and
denote the extended one by the same "X." This extension is just for notational convenience,
and causes no problem because, in what follows, we use the martingale property
of $X$ only over the time interval $\mathbb{R}_+$. For ease of notation, we also stop writing the index
$j$ for the intervals. We begin with the following remark. Take an interval $\tilde{J}^n_{-\delta_n}$, say
$(J_1, J_2]$, and $I^n(\tilde{J}^n_{-\delta_n})$ associated, say $(I_1, I_2]$. Call $J_0$ the last observation point of type
$J$ occurring before $J_1$ and $J_{-1}$ the last observation point of type $J$ occurring before $J_0$.
Two situations are possible:

- If there is no observation point of type $I$ between $I_1$ and $I_2$, then, if it exists, $J_0$ is
necessarily before $I_1$. If it does not exist, we have $J_1 \le v_n$.

- If there are some observation points of type $I$ between $I_1$ and $I_2$, then $J_0$ might
also be between $I_1$ and $I_2$. However, thanks to the Π-reduction, we know that $J_{-1}$ is
necessarily smaller than $I_1$.

Consequently, we have that $|X(I^n(\tilde{J}^n_{-\delta_n}))|$ is smaller than



sup 



\X, 



te[J" 



sup 



sup 



Xjn 



where we used the following notation: 

- I" is the first interval I n such that I n exits to the right of J" 5 . 

- J-^ 1 denotes the interval of the form J™ s which is the nearest neighbor to J™ 5 
on the left. 

- J-'s~ 2 denotes the interval of the form J™$ n which is the nearest neighbor to J™^ -1 
on the left. 

- J™' -1 is the first exit time to the right of J™^ 1 among the I n . 

- I n '~ 2 is the first exit time to the right of J"^ -2 among the 



n, — k 



- For k = l,2, if J_ s 
Hence we obtain 



is not denned, sup te jj„,-fc Jn „ fc j \X t — Xjn,-k \ = 0. 



^E[|X(/"(J!! 5nJ ))| 2 ]<E E f SU P_ \ x t~ x j 



and so ^ - E[|X(/"(J™ i5 7 ))| 2 ] can be bound in order by 



sup 



te[J™ s .Ji 



\Xt~X, 



E sup_ \X t 

1 *e[/?,/?] 



X, 



Thanks to the Π-reduction, we know that a given interval of the form $(\underline{I}^n, \overline{I}^n]$ can be
associated to, at most, two values of type $J$. Thus the second term of the preceding
quantity is smaller than



2E 



E SUP_ \X t - Xln}li<$Z 



where i is an indexing of the intervals [/"./"). Note that each 7™ is an F-stopping time 
as it is the maximum among all 



J" < J 1 



<5„> 



together with a strong predictability property; see Lemma 2 for a similar statement. So,
using the Burkholder-Davis-Gundy inequality, (12) is proved and Proposition 4 follows. □

4.4. Completion of proof of Theorem 1 under Assumption Ã

Write $A = \{\langle X, \tau_{-\vartheta}(Y)\rangle_T \neq 0\}$. By Assumption B3, we have

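Therefore, there exists a sequence $\tilde\vartheta_n$ in $\mathcal{G}^n$ such that $\tilde\vartheta_n \le \vartheta$ and $|\tilde\vartheta_n - \vartheta| \le 2\rho_n$. For
sufficiently large $n$, we have $\rho_n < \varepsilon_n = 2v_n$. Moreover, on the event $A$,

$$|\mathcal{U}^n(\tilde\vartheta_n)| > \sup_{\tilde\vartheta \in \mathcal{G}^n_+ \cup \mathcal{G}^n_-} |\mathcal{U}^n(\tilde\vartheta)|$$

implies $|\hat\vartheta_n - \vartheta| < \varepsilon_n$. It follows that

$$P[\{|\hat\vartheta_n - \vartheta| \ge \varepsilon_n\} \cap A] \le P\Big[\Big\{\sup_{\tilde\vartheta \in \mathcal{G}^n_+ \cup \mathcal{G}^n_-} |\mathcal{U}^n(\tilde\vartheta)| \ge |\mathcal{U}^n(\tilde\vartheta_n)|\Big\} \cap A\Big].$$

Let $\varepsilon > 0$. For large enough $n$, the probability to have $\Delta_n$ smaller than $v_n$ is larger than
$1 - \varepsilon$ and, consequently,

$$P[\{|\hat\vartheta_n - \vartheta| \ge \varepsilon_n\} \cap A] \le P\Big[\Big\{\sup_{\tilde\vartheta \in \mathcal{G}^n_+ \cup \mathcal{G}^n_-} |\mathcal{U}^n(\tilde\vartheta)_{T+\delta}| \ge |\mathcal{U}^n(\tilde\vartheta_n)|\Big\} \cap A\Big] + \varepsilon.$$

This can be bounded in order by

$$P\Big[\Big\{|\mathcal{U}^n(\tilde\vartheta_n)| \le \tfrac{1}{2}|\langle X, \tau_{-\vartheta}(Y)\rangle_T|\Big\} \cap A\Big] + P\Big[\Big\{\sup_{\tilde\vartheta \in \mathcal{G}^n_+ \cup \mathcal{G}^n_-} |\mathcal{U}^n(\tilde\vartheta)_{T+\delta}| \ge \tfrac{1}{2}|\langle X, \tau_{-\vartheta}(Y)\rangle_T|\Big\} \cap A\Big] + \varepsilon,$$

and this last quantity converges to $\varepsilon$ as $n \to \infty$ by applying Proposition 3 and Proposition 4.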



4.5. The case with drifts 

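We now give the proof of Theorem 1 under Assumptions Ã1, Ã2 and B. The contrast
$\mathcal{U}^n(\tilde\vartheta)$ admits the decomposition $\mathcal{U}^n(\tilde\vartheta) = \tilde{\mathcal{U}}^n(\tilde\vartheta) + \mathcal{R}^n(\tilde\vartheta)$ with

$$\tilde{\mathcal{U}}^n(\tilde\vartheta) = \sum_{I \in \mathcal{I}, J \in \mathcal{J}, \overline{I} \le T} X^c(I) Y^c(J) 1_{\{I \cap J_{-\tilde\vartheta} \neq \emptyset\}}$$

and

$$\mathcal{R}^n(\tilde\vartheta) = \sum_{I \in \mathcal{I}, J \in \mathcal{J}} (X(I)B(J) + A(I)Y^c(J)) 1_{\{I \cap J_{-\tilde\vartheta} \neq \emptyset\}}.$$

For a function $t \to Z_t$ defined on the interval $H$, introduce the modulus of continuity

$$w_Z(a, H) = \sup\{|Z_t - Z_s|,\ s, t \in H,\ |s - t| < a\}, \qquad a > 0.$$

We have

$$\sup_{\tilde\vartheta \in [0, \delta)} |\mathcal{R}^n(\tilde\vartheta)| \le w_X(3\Delta_n, [0, T]) \sup_{t \in [0, T+\delta]} |B_t| + w_{Y^c}(3\Delta_n, [0, T + \delta]) \sup_{t \in [0, T]} |A_t|,$$

and this term goes to 0 in probability as $n \to \infty$.

Finally, the result is obtained in a similar way as in the no-drift case, using $(X^c, Y^c)$
in place of $(X, Y)$.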

4.6. The case where $\vartheta \in (-\delta, \delta)$

We now give the proof of Theorem 1 under Assumptions Ã1 and B. Even in the case
where $\vartheta$ is negative, Proposition 3 is still in force, and we obtain

$$\sup_{\tilde\vartheta \in \mathcal{G}^n \cap [0, \delta)} |\mathcal{U}^n(\tilde\vartheta)| \to 0$$

in probability as $n \to \infty$. The result follows from Remark 3.

5. A numerical illustration on simulated data 

5.1. Synchronous data: Methodology 

We first briefly analyze the performance of $\hat\vartheta_n$ on a simulated lead-lag Bachelier
model without drift. More specifically, we take a random process $(X, \tau_{-\vartheta}(Y))$ following
the representation given in (2) in Section 2.1, having

$$T = 1, \quad \delta = 1, \quad \vartheta = 0.1, \quad x_0 = y_0 = 0, \quad \sigma_1 = \sigma_2 = 1.$$

In this simple model, we consider again synchronous, equispaced data with period $\Delta_n$
and correlation parameter $\rho$. We construct $\hat\vartheta_n$ with a grid $\mathcal{G}^n$
with equidistant points with mesh$^2$ $h_n = \Delta_n$. We consider the following variations:

1. Mesh size: $h_n \in \{10^{-3}, 3 \cdot 10^{-3}, 6 \cdot 10^{-3}\}$.

2. Correlation value: $\rho \in \{0.25, 0.5, 0.75\}$.
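The experiment described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the code used for the tables: it assumes NumPy, measures time in integer ticks (one tick is one mesh $h_n$, which avoids floating-point ties between interval end-points), freezes the driving paths at 0 before time 0, and uses 500 ticks instead of the meshes listed above to keep the demo fast. The names `simulate_bachelier`, `contrast` and `estimate_lag` are hypothetical.

```python
import numpy as np

def simulate_bachelier(n_steps, lag_steps, rho, rng):
    """X_k = B_k and Y_k = rho*B_{k-lag} + sqrt(1-rho^2)*W_{k-lag} on [0, 1].

    Uses sigma_1 = sigma_2 = 1 and x_0 = y_0 = 0; paths are frozen at 0 before time 0.
    """
    dt = 1.0 / n_steps
    db, dw = rng.normal(0.0, np.sqrt(dt), size=(2, n_steps))
    b = np.concatenate([[0.0], np.cumsum(db)])
    w = np.concatenate([[0.0], np.cumsum(dw)])
    z = rho * b + np.sqrt(1.0 - rho**2) * w
    idx = np.clip(np.arange(n_steps + 1) - lag_steps, 0, None)
    return b, z[idx]

def contrast(tx, x, ty, y, shift):
    """Sum of X(I) Y(J) over pairs whose intervals overlap after shifting J back by `shift`."""
    tx, ty = np.asarray(tx), np.asarray(ty)
    dx, dy = np.diff(x), np.diff(y)
    lo = np.maximum(tx[:-1, None], ty[None, :-1] - shift)
    hi = np.minimum(tx[1:, None], ty[None, 1:] - shift)
    return float(dx @ (lo < hi).astype(float) @ dy)

def estimate_lag(tx, x, ty, y, grid):
    """Maximize the absolute contrast over a finite grid of candidate shifts."""
    values = np.array([abs(contrast(tx, x, ty, y, s)) for s in grid])
    return grid[int(np.argmax(values))], values

rng = np.random.default_rng(0)
n_steps, lag_steps, rho = 500, 50, 0.75        # true lead-lag 50/500 = 0.1
ticks = np.arange(n_steps + 1)                 # synchronous, equispaced sampling
x, y = simulate_bachelier(n_steps, lag_steps, rho, rng)
grid = np.arange(-100, 101)                    # candidate shifts in ticks
lag_hat, values = estimate_lag(ticks, x, ticks, y, grid)
print("estimated lead-lag:", lag_hat / n_steps)
```

Repeating the last block over many seeds and over several values of `rho` and `n_steps` gives a Monte Carlo study of the kind reported in this section.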

5.2. Synchronous data: Estimation results and their analysis 

We repeat 300 simulations of the experiment and compute the value of $\hat\vartheta_n$ each time,
the true value being $\vartheta = 0.1$, letting $\rho$ vary in $\{0.25, 0.5, 0.75\}$. We adopt the following
terminology:

1. The fine grid estimation (abbreviated FG) with $h_n = 10^{-3}$.

2. The moderate grid estimation (abbreviated MG) with $h_n = 3 \cdot 10^{-3}$.

3. The coarse grid estimation (abbreviated CG) with $h_n = 6 \cdot 10^{-3}$.

$^2$ Note that, strictly speaking, such a grid is not fine enough to fulfill our assumptions. However,
the contrast function is constant over all the points of a given interval $(k\Delta_n, (k + 1)\Delta_n)$, $k \in \mathbb{Z}$, and its
value is just the sum of the values obtained for the shifts $k\Delta_n$ and $(k + 1)\Delta_n$.

The estimation results are displayed in Table 1 below. Unsurprisingly, for a given mesh
$h_n$, the difficulty of the estimation problem increases as $\rho$ decreases.

In the fine grid approximation case (FG) with mesh $h_n = 10^{-3}$, the lead-lag parameter
$\vartheta$ belongs to $\mathcal{G}^n$ exactly. Therefore, the contrast $\mathcal{U}^n(\tilde\vartheta)$ is close to 0 for all values $\tilde\vartheta \in \mathcal{G}^n$,
except perhaps for the exact value $\tilde\vartheta = \vartheta$. This is illustrated in Figure 1 and Figure 2
below, where we display the values of $|\mathcal{U}^n(\tilde\vartheta)|$. Note how much more scattered the values
of $\mathcal{U}^n(\tilde\vartheta)$ are for $\rho = 0.25$ compared to $\rho = 0.75$. This is, of course, no surprise.

Figure 1. Fine grid case (FG). Over one simulation: displayed values of $|\mathcal{U}^n(\tilde\vartheta)|$ for $\tilde\vartheta \in \mathcal{G}^n$
with mesh $h_n = 10^{-3}$ and $\rho = 0.75$. The value $\max_{\tilde\vartheta \in \mathcal{G}^n} |\mathcal{U}^n(\tilde\vartheta)|$ is well located.

Figure 2. Same setting as in Figure 1 for $\rho = 0.25$. The value $\max_{\tilde\vartheta \in \mathcal{G}^n} |\mathcal{U}^n(\tilde\vartheta)|$ is still correctly
located.

For the moderate grid (MG) and the coarse grid (CG) cases, the lead-lag parameter
$\vartheta \notin \mathcal{G}^n$. Hence, $\mathcal{U}^n(\tilde\vartheta)$ is close to 0 for all values of $\mathcal{G}^n$ except two. When $\rho$ is
small, the statistical error in the estimation of $\rho$ is such that $\max_{\tilde\vartheta \in \mathcal{G}^n} |\mathcal{U}^n(\tilde\vartheta)|$ is not well
located anymore. The error in the estimation can then be substantial, but is nevertheless
consistent with our convergence result. This is illustrated in Figures 3 to 8 below.

When $\rho$ decreases or when the mesh $h_n$ of the grid increases, the performance of $\hat\vartheta_n$
deteriorates, as shown in Figures 7 and 8 below.

Figure 3. Moderate grid case (MG). Over one simulation: displayed values of $|\mathcal{U}^n(\tilde\vartheta)|$ for $\tilde\vartheta \in \mathcal{G}^n$
with mesh $h_n = 10^{-3}$ and $\rho = 0.75$. The value $\max_{\tilde\vartheta \in \mathcal{G}^n} |\mathcal{U}^n(\tilde\vartheta)|$ is still well located. We begin
to see the effect of the maximization over a grid $\mathcal{G}^n$ which does not match exactly with the true
value $\vartheta$, with the appearance of a second maximum.

Figure 4. Same setting as in Figure 3 for $\rho = 0.25$. The value $\max_{\tilde\vartheta \in \mathcal{G}^n} |\mathcal{U}^n(\tilde\vartheta)|$ is still correctly
located, but the overall shape of $|\mathcal{U}^n(\tilde\vartheta)|$ deteriorates.



Figure 5. Coarse grid case (CG). Over one simulation: displayed values of $|\mathcal{U}^n(\tilde\vartheta)|$ for $\tilde\vartheta \in \mathcal{G}^n$
with mesh $h_n = 10^{-3}$ and $\rho = 0.75$. The value $\max_{\tilde\vartheta \in \mathcal{G}^n} |\mathcal{U}^n(\tilde\vartheta)|$ is still well located. The fact
that $\mathcal{G}^n$ does not match $\vartheta$ appears more clearly than in Figure 3.

Figure 6. Same setting as in Figure 5 for $\rho = 0.25$. The value $\max_{\tilde\vartheta \in \mathcal{G}^n} |\mathcal{U}^n(\tilde\vartheta)|$ is no longer
correctly located.



5.3. Non-synchronous data 

We randomly pick 300 sampling times for $X$ over $[0, 1]$ uniformly over a grid of mesh size
$10^{-3}$. We randomly pick 300 sampling times for $Y$ likewise, and independently of the
sampling for $X$. The data generating process is the same as in Section 5.1. In Table 2, we
display the estimation results for 300 simulations, in the fine grid case (FG) with $\vartheta = 0.1$
and $\rho = 0.75$.

The histograms for the case $\rho = 0.5$ and $\rho = 0.25$ are displayed in Figures 9 and 10.

Figure 7. Moderate grid case (MG). Histogram of the values of $\hat\vartheta_n$ with true value $\vartheta = 0.1$ over
300 simulations for $\rho = 0.25$.

Figure 8. Coarse grid case (CG). Histogram of the values of $\hat\vartheta_n$ with true value $\vartheta = 0.1$ over
300 simulations for $\rho = 0.25$.
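The non-synchronous sampling scheme of Section 5.3 can be sketched as follows. This is a minimal sketch under assumptions that the text does not fix: NumPy is assumed, the 300 times are drawn without replacement among the grid points, time is measured in integer ticks of size $10^{-3}$, and the names `sample_ticks` and `contrast` are hypothetical.

```python
import numpy as np

def sample_ticks(n_grid, n_obs, rng):
    """Pick n_obs distinct observation times uniformly on the grid {0, ..., n_grid}."""
    return np.sort(rng.choice(n_grid + 1, size=n_obs, replace=False))

def contrast(tx, x, ty, y, shift):
    """Sum of X(I) Y(J) over pairs whose intervals overlap after shifting J back by `shift`."""
    dx, dy = np.diff(x), np.diff(y)
    lo = np.maximum(tx[:-1, None], ty[None, :-1] - shift)
    hi = np.minimum(tx[1:, None], ty[None, 1:] - shift)
    return float(dx @ (lo < hi).astype(float) @ dy)

rng = np.random.default_rng(1)
n_grid, lag, rho = 1000, 100, 0.75              # mesh 1e-3, true lead-lag 0.1
db, dw = rng.normal(0.0, np.sqrt(1.0 / n_grid), size=(2, n_grid))
b = np.concatenate([[0.0], np.cumsum(db)])      # X = B
z = rho * b + np.sqrt(1.0 - rho**2) * np.concatenate([[0.0], np.cumsum(dw)])
y_path = z[np.clip(np.arange(n_grid + 1) - lag, 0, None)]   # Y_t = Z_{t - lag}

tx = sample_ticks(n_grid, 300, rng)             # sampling times for X
ty = sample_ticks(n_grid, 300, rng)             # independent sampling times for Y
grid = np.arange(-200, 201)                     # candidate shifts in ticks
scores = np.array([abs(contrast(tx, b[tx], ty, y_path[ty], s)) for s in grid])
lag_hat = grid[int(np.argmax(scores))] / n_grid
print("estimated lead-lag:", lag_hat)
```

Because the two sets of sampling times are drawn independently, almost no observation of $X$ is aligned with an observation of $Y$; the overlap-based contrast handles this without any interpolation step.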

6. A numerical illustration on real data 
6.1. The data set 

We study here the lead-lag relationship between the following two financial assets:

- The future contract on the DAX index (FDAX for short), with maturity December
2010.

- The Euro-Bund future contract (Bund for short), with maturity December 2010,
which is an interest rate product based on notional long-term debt instruments
issued by the Federal Republic of Germany.

These two assets are electronically traded on the EUREX market, and are known
to be highly liquid. Our data set has been provided by the company QuantHouse
EUROPE/ASIA$^3$. It consists of all the trades for 20 days of October 2010. Each trading day
starts at 8.00 am CET and finishes at 22.00 CET, and the accuracy in the timestamp
values is one millisecond.

$^3$ http://www.quanthouse.com.

Table 2. Estimation of $\vartheta = 0.1$ on 300 simulated samples for $\rho = 0.75$ and non-synchronous
data

Figure 9. Fine grid case (FG), non-synchronous data. Histogram of the values of $\hat\vartheta_n$ with true
value $\vartheta = 0.1$ over 300 simulations for $\rho = 0.5$.

Figure 10. Fine grid case (FG), non-synchronous data. Histogram of the values of $\hat\vartheta_n$ with true
value $\vartheta = 0.1$ over 300 simulations for $\rho = 0.25$. The performance of $\hat\vartheta_n$ clearly deteriorates as
compared to Figure 9.

Figure 11. Signature plot for the Bund (left) and the FDAX (right) for 2010, October 13.



6.2. Methodology: A one day analysis 

In order to explain our methodology, we take the example of a representative day: 2010, 
October 13. 

Microstructure noise

Since high-frequency data are concerned, we need to incorporate microstructure noise
effects, at least at an empirical level. A classical way to study the intensity of the
microstructure noise is to draw the signature plot (here in trading time). The signature plot
is a function from $\mathbb{N}$ to $\mathbb{R}_+$. To a given integer $k$, it associates the sum of the squared
increments of the traded price (the realized volatility) when only 1 trade out of $k$ is
considered for computing the traded price. If the price were coming from a continuous-
time semi-martingale, the signature plot should be approximately flat. In practice, it is
decreasing, as shown by Figure 11.
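The signature plot just described is straightforward to compute. The following is a minimal sketch, assuming NumPy and a one-dimensional array of traded prices in trading time; the function name `signature_plot` and the toy noise model in the demo (a random walk plus independent noise) are hypothetical and only serve to illustrate the decreasing shape.

```python
import numpy as np

def signature_plot(prices, max_k):
    """For k = 1..max_k, sum of squared increments keeping 1 trade out of k."""
    prices = np.asarray(prices, dtype=float)
    return np.array([np.sum(np.diff(prices[::k]) ** 2) for k in range(1, max_k + 1)])

# Toy illustration: an efficient price observed with additive i.i.d. noise.
rng = np.random.default_rng(0)
efficient = np.cumsum(rng.normal(0.0, 0.01, 20_000))   # hypothetical efficient price
observed = efficient + rng.normal(0.0, 0.02, 20_000)   # hypothetical microstructure noise
curve = signature_plot(observed, 40)
print(curve[0], curve[19], curve[39])
```

In this toy model the noise inflates the realized volatility at small $k$, and the curve flattens as $k$ grows, which is the qualitative behavior used to choose a subsampling frequency.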

According to Figure 11, for all the days considered, we subsample our data so that we
keep one trade out of 20. On 2010, October 13, after subsampling, there remain 2018 trades
for the Bund and 3037 trades for the FDAX.

Construction of the contrast function 

The second step is to compute our contrast function. Here the Bund plays the role of
$X$ and the FDAX the role of $Y$. Therefore, if the estimated value is positive, it means
that the Bund is the leader asset and the FDAX the lagger asset, and conversely. To
have a first idea of the lead-lag value, we consider our contrast function for a time shift
between -10 minutes and 10 minutes, on a grid with mesh 30 seconds. The result of this
computation for 2010, October 13 is given in Figure 12.

Figure 12. The function $\mathcal{U}^n$ for 2010, October 13, time shift values between -10 minutes and
10 minutes, on a grid with mesh 30 seconds. The contrast is obtained by taking the absolute
value of $\mathcal{U}^n$.

From Figure 12, we see that the lead-lag value is close to zero. Thus, we then compute
the contrast function for a time shift between -5 seconds and 5 seconds, on a grid with
mesh 0.1 second. The result of this computation for 2010, October 13 is given in Figure 13.

From Figure 13, we can conclude that on 2010, October 13, the FDAX seems to lead
the Bund, with a small lead-lag value of -0.8 second.
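The two-stage search described above (a coarse grid to locate the region of interest, then a fine grid around it) can be sketched generically. This is a minimal sketch under assumptions: NumPy is assumed, `contrast_fn` stands for any function returning the contrast at a given shift in seconds, the fine grid is centered on the coarse maximizer (a choice made here for illustration), and the names `argmax_abs` and `coarse_to_fine` are hypothetical. The toy contrast in the demo is synthetic and is not market data.

```python
import numpy as np

def argmax_abs(contrast_fn, grid):
    """Grid point maximizing the absolute value of the contrast."""
    values = np.array([abs(contrast_fn(s)) for s in grid])
    return float(grid[int(np.argmax(values))])

def coarse_to_fine(contrast_fn, coarse_half_width, coarse_mesh, fine_half_width, fine_mesh):
    """Maximize on a coarse symmetric grid, then refine on a fine grid around the maximizer."""
    n_coarse = int(round(coarse_half_width / coarse_mesh))
    coarse = coarse_mesh * np.arange(-n_coarse, n_coarse + 1)
    center = argmax_abs(contrast_fn, coarse)
    n_fine = int(round(fine_half_width / fine_mesh))
    fine = center + fine_mesh * np.arange(-n_fine, n_fine + 1)
    return center, argmax_abs(contrast_fn, fine)

# Synthetic contrast peaked at -0.8 second (illustration only).
toy = lambda s: np.exp(-(s + 0.8) ** 2)
center, lag = coarse_to_fine(toy, 600.0, 30.0, 5.0, 0.1)   # +/-10 min at 30 s, then +/-5 s at 0.1 s
print(center, lag)
```

The coarse pass costs 41 contrast evaluations and the fine pass 101, instead of the 12001 evaluations a single 0.1-second grid over 20 minutes would require.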

6.3. Systematic results over a one-month period 

We now give, in Figure 14, the results for all the days of October 2010. 

The results of Figure 14 seem to indicate that, on average, the FDAX tends to lead the
Bund. Indeed, the estimated lead-lag values are systematically negative. Of course these
results have to be taken with care since the estimated values are relatively small (of the
order of one second); however, dealing with highly traded assets on electronic markets, the
order of magnitude of the lead-lag values that we find is no surprise and is consistent
with common knowledge. A possible interpretation - yet speculative at the exploratory
level intended here - for the presence of such lead-lag effects is the difference between
the tick sizes of the different assets. Indeed, the negative values could mean that the tick
size of the FDAX can be considered smaller than that of the Bund.

Figure 13. The function $\mathcal{U}^n$ for 2010, October 13, time shift values between -5 seconds and 5
seconds, on a grid with mesh 0.1 second. The contrast is obtained by taking the absolute value
of $\mathcal{U}^n$.



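Figure 14. Estimated lead-lag values for October 2010.

Appendix

A.1. Proof of Proposition 1

For notational clarity, for a given interval $I = (\underline{I}, \overline{I}]$, we may sometimes write $X(\underline{I}, \overline{I})$
instead of $X(I)$ when no confusion is possible. In the Bachelier case with lead-lag parameter
$\vartheta \in \Theta$, we work with the following explicit representation of the observation process:

$$X_t = x_0 + \sigma_1 B_t, \qquad Y_t = y_0 + \sigma_2(\rho B_{t-\vartheta} + \sqrt{1 - \rho^2}\, W_{t-\vartheta}), \qquad (13)$$

where $B$ and $W$ are two independent Brownian motions. We have

$$\mathcal{U}^n(\tilde\vartheta) = \sum_{0 < i\Delta_n \le T} X((i-1)\Delta_n, i\Delta_n)\, \tau_{-\tilde\vartheta} Y((i-1)\Delta_n, i\Delta_n) = \sigma_1 \sigma_2 \sum_{0 < i\Delta_n \le T} \chi_i^n(\tilde\vartheta),$$

with

$$\chi_i^n(\tilde\vartheta) = B((i-1)\Delta_n, i\Delta_n)\,[\rho\, \tau_{\vartheta - \tilde\vartheta} B((i-1)\Delta_n, i\Delta_n) + \sqrt{1 - \rho^2}\, \tau_{\vartheta - \tilde\vartheta} W((i-1)\Delta_n, i\Delta_n)].$$

We have

$$E[\chi_i^n(\tilde\vartheta)] = \rho\, E[B((i-1)\Delta_n, i\Delta_n)\, \tau_{\vartheta - \tilde\vartheta} B((i-1)\Delta_n, i\Delta_n)] = \rho \Delta_n \varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta)),$$

where $\varphi(x) = (1 - |x|) 1_{|x| \le 1}$ is the usual hat function. Assuming further, with no loss of
generality, that $T/\Delta_n$ is an integer, we obtain the representation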

^ 0<iA„<T ' 

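We now assume without loss of generality that $0 \le \tilde\vartheta - \vartheta \le \Delta_n$ (the symmetric case being
treated the same way). The sequence of random variables $\chi_i^n(\tilde\vartheta)$ is stationary. Moreover,
since the random variable $\chi_i^n(\tilde\vartheta)$ involves increments of $W$ and $B$ over a domain included
in $[(i - 2)\Delta_n, i\Delta_n]$ because $|\tilde\vartheta - \vartheta| \le \Delta_n$, it follows that $\chi_i^n(\tilde\vartheta)$ and $\chi_j^n(\tilde\vartheta)$ are independent
as soon as $|i - j| \ge 2$. Moreover, we claim that

$$\mathrm{Cov}(\chi_i^n(\tilde\vartheta), \chi_j^n(\tilde\vartheta)) = 0 \quad \text{if } |i - j| = 1. \qquad (14)$$

Therefore, by the central limit theorem, we have that

$$\Delta_n^{1/2} T^{-1/2} \sum_{0 < i\Delta_n \le T} (\chi_i^n(\tilde\vartheta) - E[\chi_i^n(\tilde\vartheta)])$$

is approximately centred Gaussian, with variance

$$\mathrm{Var}(\chi_i^n(\tilde\vartheta)).$$

Computation of $\mathrm{Var}(\chi_i^n(\tilde\vartheta))$

To that end, we need to evaluate

$$\mathrm{I} = \rho^2 E[(B(0, \Delta_n)\, \tau_{\vartheta - \tilde\vartheta} B(0, \Delta_n))^2],$$

and

$$\mathrm{II} = (1 - \rho^2) E[(B(0, \Delta_n)\, \tau_{\vartheta - \tilde\vartheta} W(0, \Delta_n))^2],$$

since $B(0, \Delta_n)\tau_{\vartheta - \tilde\vartheta} B(0, \Delta_n)$ and $B(0, \Delta_n)\tau_{\vartheta - \tilde\vartheta} W(0, \Delta_n)$ are uncorrelated. Writing

$$B(0, \Delta_n)\tau_{\vartheta - \tilde\vartheta} B(0, \Delta_n) = (B(0, \tilde\vartheta - \vartheta) + B(\tilde\vartheta - \vartheta, \Delta_n))(B(\tilde\vartheta - \vartheta, \Delta_n) + B(\Delta_n, \tilde\vartheta - \vartheta + \Delta_n)),$$

taking square and expectation, we readily obtain that

$$\mathrm{I} = 2\rho^2(\tilde\vartheta - \vartheta)(\Delta_n - (\tilde\vartheta - \vartheta)) + \rho^2(\tilde\vartheta - \vartheta)^2 + 3\rho^2(\Delta_n - (\tilde\vartheta - \vartheta))^2 = \rho^2 \Delta_n^2 (1 + 2\varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta))^2).$$

Concerning II, since $B$ and $W$ are independent, we readily have

$$\mathrm{II} = (1 - \rho^2)\Delta_n^2,$$

therefore, from $E[\chi_i^n(\tilde\vartheta)] = \rho \Delta_n \varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta))$, we finally infer

$$\Delta_n^{-2} \mathrm{Var}(\chi_i^n(\tilde\vartheta)) = 1 + \rho^2 \varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta))^2,$$

from which Proposition 1 follows. It remains to prove (14). By stationarity, this amounts
to evaluating

$$\rho^2 E[B(0, \Delta_n) B(\tilde\vartheta - \vartheta, \Delta_n + \tilde\vartheta - \vartheta) B(\Delta_n, 2\Delta_n) B(\Delta_n + \tilde\vartheta - \vartheta, 2\Delta_n + \tilde\vartheta - \vartheta)] - E[\chi_i^n(\tilde\vartheta)]^2.$$

To that end, we split each of the terms as follows:

$$B(0, \Delta_n) = B(0, \tilde\vartheta - \vartheta) + B(\tilde\vartheta - \vartheta, \Delta_n),$$
$$B(\tilde\vartheta - \vartheta, \Delta_n + \tilde\vartheta - \vartheta) = B(\tilde\vartheta - \vartheta, \Delta_n) + B(\Delta_n, \Delta_n + \tilde\vartheta - \vartheta),$$
$$B(\Delta_n, 2\Delta_n) = B(\Delta_n, \Delta_n + \tilde\vartheta - \vartheta) + B(\Delta_n + \tilde\vartheta - \vartheta, 2\Delta_n),$$
$$B(\Delta_n + \tilde\vartheta - \vartheta, 2\Delta_n + \tilde\vartheta - \vartheta) = B(\Delta_n + \tilde\vartheta - \vartheta, 2\Delta_n) + B(2\Delta_n, 2\Delta_n + \tilde\vartheta - \vartheta).$$

Using the stochastic independence of each of these terms, multiplying and integrating,
we easily obtain

$$\rho^2 E[B(0, \Delta_n) B(\tilde\vartheta - \vartheta, \Delta_n + \tilde\vartheta - \vartheta) B(\Delta_n, 2\Delta_n) B(\Delta_n + \tilde\vartheta - \vartheta, 2\Delta_n + \tilde\vartheta - \vartheta)] = \rho^2 \Delta_n^2 \varphi(\Delta_n^{-1}(\tilde\vartheta - \vartheta))^2 = E[\chi_i^n(\tilde\vartheta)]^2.$$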
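The moment computations of this proof can be checked by simulation. The sketch below is a Monte Carlo sanity check under stated assumptions, not part of the proof: it assumes NumPy, writes $h = \tilde\vartheta - \vartheta \in [0, \Delta_n]$ and $D = \Delta_n$, builds two consecutive terms of the sum from the independent Brownian increments used in the splitting above, and compares empirical moments with the closed forms $\rho \Delta_n \varphi$ and $\Delta_n^2(1 + \rho^2 \varphi^2)$. The names `hat`, `sample_chi_pair` and `closed_form` are hypothetical.

```python
import numpy as np

def hat(x):
    """Hat function phi(x) = (1 - |x|) 1_{|x| <= 1}."""
    return np.maximum(1.0 - np.abs(x), 0.0)

def sample_chi_pair(delta, h, rho, n_samples, rng):
    """Draw two consecutive terms (chi_1, chi_2) for a shift h in [0, delta]."""
    # Independent increments of B over [0,h], [h,D], [D,D+h], [D+h,2D], [2D,2D+h].
    sd = np.sqrt(np.array([h, delta - h, h, delta - h, h]))
    a, b, c, d, e = rng.normal(size=(5, n_samples)) * sd[:, None]
    # Shifted increments of W over two disjoint intervals of length D.
    w1, w2 = rng.normal(size=(2, n_samples)) * np.sqrt(delta)
    s = np.sqrt(1.0 - rho**2)
    chi1 = (a + b) * (rho * (b + c) + s * w1)
    chi2 = (c + d) * (rho * (d + e) + s * w2)
    return chi1, chi2

def closed_form(delta, h, rho):
    """Mean and variance of chi stated in the proof."""
    phi = hat(h / delta)
    return rho * delta * phi, delta**2 * (1.0 + rho**2 * phi**2)

rng = np.random.default_rng(0)
delta, h, rho = 1.0, 0.3, 0.5
chi1, chi2 = sample_chi_pair(delta, h, rho, 400_000, rng)
mean_th, var_th = closed_form(delta, h, rho)
print("mean:", chi1.mean(), "vs", mean_th)
print("variance:", chi1.var(), "vs", var_th)
print("covariance of neighbours:", np.cov(chi1, chi2)[0, 1], "vs 0")
```

The empirical covariance of neighbouring terms should be close to zero, in line with (14), even though the two terms share the increment over $[D, D+h]$.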



A.2. Proof of Proposition 2

Suppose that $\Delta_n^{-1}(\hat\vartheta_n - \vartheta) \to Z$, in law, for some random variable $Z$. For $a \in \mathbb{R}$,
we write o)- n \ the best approximation of a by a point of the form kA n , k G Z and c^"] , 
the best approximation of a by a point smaller or equal to a and of the form fcA„, fceZ. 
We have 

A" 1 ^ - 0) = A- x (0„ - 0L" J ) + A- 1 ^ - 0). 

The first term in the right-hand side of the equality is smaller than A" 1 /)^ and so 
converges to zero. The second term can be written as 

A- x (0>J - 0N) + A^(0 N - 0) = T hn + T 2 , n , 

say. The sequence $T_{1,n}$ is a random sequence of integers, and $T_{2,n}$ is a deterministic
sequence with values in $[0, 1/2]$ which does not converge. Let $\psi_n$ be a subsequence such
that $T_{2,\psi_n} \to l$ with $l \in (0, 1/2]$. Then $T_{1,\psi_n}$ converges in law to $Z - l$, which implies that
the support of $Z$ is included in $\{z + l, z \in \mathbb{Z}\}$. Consider now $\psi'_n$ such that $T_{2,\psi'_n} \to l'$ with
$l' \in [0, 1/2]$, $l' \neq l$. In the same way, we get that the support of $Z$ is also included in
$\{z + l', z \in \mathbb{Z}\}$, a contradiction.



A.3. Proof of Lemma 1

Preliminary results 

We first prove the following results. 

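Lemma 3. Work under Assumption B2, under the slightly more general assumption
that for all $I = (\underline{I}, \overline{I}] \in \mathcal{I}$, the random variables $\underline{I}$ and $\overline{I}$ are $\mathbb{F}$-stopping times.

(a) If $\tilde\vartheta \ge \vartheta + v_n$, then for any $\mathbb{F}$-stopping time $\sigma$, $\sigma + \tilde\vartheta$ is an $\mathbb{F}^{\vartheta+v_n}$-stopping
time. In particular, the random variables $\underline{I}^n_{\tilde\vartheta}$ and $\overline{I}^n_{\tilde\vartheta}$ are $\mathbb{F}^{\vartheta+v_n}$-stopping
times.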

(b) For each J G J , we have 
ancZ /or eacft J el, 

J- -p'S + f n 
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(c) Suppose that '&>•& + s n and 2v n < e n . Then for any random variable X' mea- 
surable w.r.t. J-—, the random variables X'1^ I „ < — j and X'l^ I ^ < —y are Jjn- 

measurable. 



Proof. Proof of (a) . For any F-stopping time a and t £ 



{a + ■& < t} = {a < t - >&} = {a < (t - (tf - d - v n )) v n } 

Proof of (b). Note first that under Assumption B2, the F tf+ Vn -stopping time is in 
particular an F^-stopping time; thus J-~j„ is a a-field. Moreover, since J n and ±P_ + v n 

are F 1?+ "™ -stopping times by definition, both J--^ v " and T are cr-fields, and also 
the inclusion is trivial from J™ < Jff_ + v n . To obtain the equality, it suffices to observe 
that each of the conditions "A € J-J^™ " and "A € Fjn " is equivalent to the condition 

Ar\{r<t-v n }eFf_ Vn 

for all t £ R+ . The second equality is proved in the same way. 

Proof of (c). Since J" and I~ are V^ +Vn -stopping times by assumption, we have 

the last inclusion following from (b). If I~ < J™, then 

I" < I n + V„ <7™ - + V n < J™ -1? - V n , 



which implies I$ +v < J n - Thus 

x ' l {i~<j^} =x ' 1 {T^Z ) <^} x hi%<j*r 



We have that X 1 is measurable with respect to T— = T'^ Vn — . Also Ify , ■> is a stopping 

time with respect to F^"™ by (a). Consequently, X'lrjn <tst> is J^ Vn -measurable, 

hence T Jn -measurable. Eventually, X'1^ I „ < — j is J 7 ^, -measurable. The other statement 
is proved the same way. □ 

Proof of Lemma 1. We have 

X'K(I%,J n ) =^'l{/|<^ L }l { jn < I| } +X'1 { J|. > J !1} 1 { — >7 „ } . 

Since ?? > i? + e n > i? + v ni both 1~ and J~ are ¥^ +Vn -stopping times. Therefore, the 

second term on the right-hand side of the above equality is J-j„ -measurable by (c) of 
Lemma 3. 
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Now we notice that if I~ < J" then I~ < J™, therefore 
1? ' — 

x ' 1 {^<J^} 1 {ri<m = i x ' 1 {i'!<—}) x ( 1 {i$<£L} 1 {J"<m)- 

The first factor on the right-hand side of the above equality is J-j„ -measurable by (c) 
of Lemma 3, and the second factor is obviously J-j n -measurable. This completes the 
proof. □ 



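A.4. Proof of Lemma 2

Let us fix $I \in \mathcal{I}$. For $J \in \mathcal{J}$, write $\overline{J}$ as shorthand for the right end-point of $\tilde{J}^n_{-\delta_n}$, and let

$$T_J = \begin{cases} \overline{I}^n - v_n & \text{on } \{\overline{J} - v_n > \overline{I}^n - v_n\}, \\ \overline{J} & \text{on } \{\overline{J} - v_n \le \overline{I}^n - v_n\}. \end{cases}$$

We know that $\overline{I}^n - v_n$ is an $\mathbb{F}$-stopping time by Assumption B2, and also that $\overline{J} - v_n$
is an $\mathbb{F}$-stopping time due to $\delta_n \le 0$. Let us show first that the $T_J$'s are $\mathbb{F}$-stopping times.
Let $t \in [-\delta, T + \delta]$. Let

$$A_1 = \{\overline{I}^n - v_n \le t,\ \overline{J} - v_n > \overline{I}^n - v_n\}$$

and

$$A_2 = \{\overline{J} \le t,\ \overline{J} - v_n \le \overline{I}^n - v_n\}.$$

It is obvious that $A_1 \in \mathcal{F}_t$ since $\overline{I}^n - v_n$ is an $\mathbb{F}$-stopping time and also

$$\{\overline{J} - v_n > \overline{I}^n - v_n\} \in \mathcal{F}_{\overline{I}^n - v_n}.$$

For the term $A_2$, if $t \in [-\delta, -\delta + v_n]$, then $A_2 = \emptyset \in \mathcal{F}_{-\delta} \subset \mathcal{F}_t$. Otherwise, if $t \in (-\delta + v_n, T + \delta]$, then

$$A_2 = \{\overline{J} - v_n \le t - v_n,\ \overline{J} - v_n \le \overline{I}^n - v_n\} \in \mathcal{F}_{t - v_n} \subset \mathcal{F}_t.$$

Eventually, we have $\{T_J \le t\} \in \mathcal{F}_t$; hence $T_J$ is an $\mathbb{F}$-stopping time.

In conclusion, there exists at least one $\overline{J}$ in $[\overline{I}^n - v_n, \overline{I}^n]$. Therefore, we have $M^I =
\sup_J T_J$, and this implies that $M^I$ is also an $\mathbb{F}$-stopping time.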